Why are the laws of physics formulated in terms of complex Hilbert spaces? Are there natural 
and consistent modifications of quantum theory that could be tested experimentally? This book 
chapter gives a self-contained and accessible summary of our paper [New J. Phys. 13, 063001, 2011] 
addressing these questions, presenting the main ideas, but dropping many technical details. We 
show that the formalism of quantum theory can be reconstructed from four natural postulates, 
which do not refer to the mathematical formalism, but only to the information-theoretic content 
of the physical theory. Our starting point is to assume that there exist physical events (such 
as measurement outcomes) that happen probabilistically, yielding the mathematical framework of 
"convex state spaces". Then, quantum theory can be reconstructed by assuming that (i) global 
states are determined by correlations between local measurements, (ii) systems that carry the same 
amount of information have the same state spaces, (iii) reversible time evolution can map every 
pure state to every other, and (iv) positivity of probabilities is the only restriction on the possible 
measurements. 



I. INTRODUCTION 

By all standards, quantum theory is one of the most 
successful theories of physics. It provides the basis of 
particle physics, chemistry, solid state physics, and it is 
of paramount importance for many technological achieve- 
ments. So far, all experiments have confirmed its univer- 
sal validity in all parts of our physical world. Unfortu- 
nately, quantum theory is also one of the most mysterious 
theories of physics. 

In textbooks, quantum theory is usually introduced by
stating several abstract mathematical postulates: States
are unit vectors in a complex Hilbert space; probabilities
are given by the Born rule; the Schrödinger equation
describes time evolution in closed systems, to name just
some of them. As many students recognize - and as
experienced researchers, after years of habituation,
sometimes tend to forget - these postulates seem arbitrary
and do not have a clear meaning. It is true that they work
very well and are in accordance with experiments, but
why are they true? Why is nature described by these
counterintuitive laws of complex Hilbert spaces?

What at first sight seems to be a physically vacuous, 
philosophical question is in fact of high relevance to the- 
oretical physics, in particular for attempts to generalize 
quantum theory. There have been several attempts in 
the past to construct natural modifications of quantum 
theory - either to set up experimental tests of quantum 
physics, or to adapt it in a way which allows for easier 
unification with general relativity. However, modifica- 
tion of quantum theory turned out to be a surprisingly 
difficult task. 

A historical example is given by Weinberg's [1] nonlinear
modification of quantum theory. Only a few months after
his proposal was published, Gisin [2] demonstrated that
the resulting theory has an unexpected poisonous property:
it allows for superluminal signalling. It can be shown in
general that other proposals of this kind must face a
similar fate [44]. It seems as if the usual postulates of
quantum theory are intricately intertwined, in such a way
that modifying one postulate makes the combination of the
others collapse into a physically meaningless - or at least
problematic - theory.

In this paper, we propose a way to overcome this situation:
we consider four natural information-theoretic postulates
that have a clear physical meaning and which, when taken
together, turn out to be equivalent to the usual postulates
of quantum theory. In particular, these postulates do not
refer to complex numbers, Hilbert spaces, or operators, but
use only notions which make sense in terms of classical
probability. They can loosely be stated as follows:

1. The state of a composite system is characterized 
by the statistics of measurements on the individual 
components. 

2. All systems that effectively carry the same amount 
of information have equivalent state spaces. 

3. Every pure state of a system can be transformed 
into every other by continuous reversible time evo- 
lution. 

4. In systems that carry one bit of information, all 
measurements which give non-negative probabili- 
ties are allowed by the theory. 

Below, we show how to derive the usual formalism of 
quantum theory from these postulates. Surprisingly, 
the complex numbers and Hilbert spaces pop out even 
though they are not mentioned in the postulates. This 
will also allow us to gain a better understanding of the 
usual quantum formalism, and resolve some of the mys- 
tery around ad hoc postulates like the Born rule. 



Our result suggests an obvious method to obtain natural
modifications of quantum theory: drop one of the postulates
that we propose, and work out mathematically what the
resulting set of theories looks like. In contrast to the
usual formulation of quantum theory, we know for sure that
the resulting alternative theories exist and are consistent
- for example, they do not allow for superluminal
signalling as in Weinberg's approach. In a way, those
theories are "quantum theory's closest cousins": they are
not necessarily formulated in terms of Hilbert spaces, but
they are physically and conceptually as close to quantum
theory as possible.

As the simplest possible modification, suppose we drop 
the word "continuous" from Postulate 3 - that is, we al- 
low for discrete reversible time evolution. Unsurprisingly, 
another solution in addition to quantum theory appears: 
in this additional theory, states are (discrete) probability 
distributions, and reversible time evolution is given by 
permutations of outcomes. This is exactly classical prob- 
ability theory in the discrete case. It turns out to be the 
unique additional solution in this case. 

This book chapter is a summary of our paper [3], which
in turn is part of a wave of axiomatizations of quantum
theory which arose partly in reaction to the development
of quantum information theory. This modern approach
to reconstruction was pioneered by Hardy [4], followed
by Dakić and Brukner's work [5], and then in rapid
succession by our result [3], the reconstruction by the
Pavia group [6], and another elaboration by Hardy [7, 8].
Of course, attempts to axiomatize quantum theory date
back much further, including those by Birkhoff and
von Neumann [9], Mackey [10], and Ludwig [11]. From a
more mathematical angle, there has been extensive work
on classifying the state spaces of operator algebras [12].

Every axiomatization has its own benefits. We think
that the main advantage of our work - as described below -
is its parsimony: our postulates are rather weak, possibly
even close to optimal. Thus, one may expect that dropping
one or two of the postulates will allow one to discover
other theories that share many interesting features with
quantum theory, but still describe a different kind of
physics.



II. WHAT DO WE MEAN BY "QUANTUM 
THEORY"? 



When talking about axiomatizing quantum theory,
there is sometimes confusion about what we actually
mean by it. The term "quantum theory" evokes associations
with many different aspects of physics that are usually
treated in quantum mechanics textbooks, such as particles,
the hydrogen atom, three-dimensional position and momentum
space, and many other things.

However, a more careful definition should apply here.
As an analogy, consider the theory of statistical mechanics.
This theory consists of an application of probability
theory to mechanics, which means in particular that
abstract probability theory can be studied detached from
statistical physics - and this has been done in mathematics
for a very long time.

Similarly, we can consider quantum mechanics to be a 
combination of an abstract probabilistic theory - quan- 
tum theory - and classical mechanics. Abstract quantum 
theory can be studied detached from its mechanical real- 
ization; the main difference to the previous example lies 
in the historical fact that the development of quantum 
mechanics preceded that of abstract quantum theory. In 
this terminology, we understand by "quantum theory" 
the statement that 

• states are vectors (or density matrices) in a complex 
Hilbert space, 

• probabilities are computed by the Born rule or, for
density matrices, the trace rule,

• the possible reversible transformations are the
unitaries,

• measurements are described by projection opera- 
tors, and thus observables are given by self-adjoint 
matrices. 

The "classical mechanics" part, on the other hand,
determines the type of Hilbert space to consider (such as
$L^2(\mathbb{R}^3)$), the choice of "Hamiltonians" $H$ which
generate the time evolution, $U(t) = \exp(iHt)$, and the
choice of initial states of that time evolution. This
conceptual distinction has proven particularly useful in
the development of quantum information theory. It seems
that this distinction was always implicit when expressing
the desire to "quantize" any classical physical theory,
that is, to combine it with abstract quantum theory.

Thus, since we are aiming for a reconstruction of ab- 
stract quantum theory, we will not refer to position, mo- 
mentum, or Hamiltonians in this paper. Instead, we only 
use the notions of abstract probability theory: of events, 
happening with certain probabilities, and of transforma- 
tions modifying the probabilities. Furthermore, we re- 
strict our analysis to finite-dimensional systems: we ar- 
gue that the main mystery is why to have a complex 
Hilbert space at all. If this is understood in finite di- 
mensions, it seems only a small conceptual (though possi- 
bly mathematically challenging) step to guess the correct 
infinite-dimensional generalizations. 

Since we presuppose probabilities as given, we also do
not address the question of where these probabilities come
from. Hence we also ignore the question of what happens
in a quantum measurement, and all other interpretational
mysteries surrounding the formulation of quantum theory.
Instead, we restrict ourselves to asking how the
mathematical formalism of quantum theory can be derived
from simpler postulates, and what possible modifications
of it we might hope to find in nature.



Questions that we would like to address: 

• How can we understand (that is, derive) the 
complex Hilbert space formalism from simple 
assumptions on probabilities? 

• What other probabilistic theories are concep- 
tually closest to quantum theory? 

Questions that we do not address: 

• What is "probability"? 

• The measurement problem: What happens to 
a state during / after a measurement? 

• How can we interpret quantum mechanics? 



In order to formulate our postulates, we work with a 
simple and general framework encompassing all conceiv- 
able ways to formulate physical theories of probability: 
this is the framework of generalized probabilistic theories. 



III. GENERALIZED PROBABILISTIC 
THEORIES 



Classical probability theory (abbreviated CPT hence- 
forth) is used to describe processes which are not de- 
terministic. This is achieved by assuming a particu- 
lar mathematical structure: a probability space with a 
unique fixed probability measure, which is used to as- 
sign probabilities to all random variables. The frame- 
work of generalized probabilistic theories [4, 9, 10, 13- 
16] generalizes this approach in a simple way. We will 
now give a brief introduction to this framework, built on 
general considerations of what constitutes an experiment 
in physics. For more detailed introductions, we refer the 
reader to [14, 15], and for nice presentations of the main 
ideas to [19, 20]. 

In order to set up a common picture, we consider Fig- 
ure 1 as the model for what constitutes a physical ex- 
periment. This is just an illustration: the events that 
we describe are arbitrary, and may as well be natural 
processes that happen without human or technological 
intervention. 

The main idea (cf. Figure 1) is that physical systems
can cause objective events, for example clicks of
detectors. We say that two systems are in the "same state"
$\omega$ if all outcome probabilities of all possible
measurements are the same. In order to test this
empirically, we always assume that we can prepare a
physical system in a given state as often as we want. That
is, we may think of a preparation device which produces a
physical system in a particular state.

FIG. 1: General experimental setup. From left to right
there are the preparation, transformation and measurement
devices. As soon as the release button is pressed, the
preparation device outputs a physical system in the state
specified by the knobs. The next device performs the
transformation specified by its knobs (which in particular
can be "do nothing"). The device on the right performs the
measurement specified by its knobs, and the outcome
($x$ or $\bar{x}$) is indicated by the corresponding light.



A. States and measurements 

Single outcomes of measurements are called effects,
and are denoted by uppercase letters such as $E$. The
probability of obtaining outcome $E$, if measured on state
$\omega$, will be denoted $E(\omega)$. This way, effects
become maps from states to probabilities in $[0, 1]$.

What can we say about the set of all possible states
$\omega$ in which a given system can be prepared? Suppose
we have two preparation devices; one of them prepares the
system in some state $\omega$, the other one prepares it in
some state $\varphi$. Then we can use these devices to
construct a new device, which tosses a coin and then
prepares either state $\omega$ with probability
$p \in [0, 1]$, or state $\varphi$ with probability $1 - p$.
We denote this new state by

$$\omega' := p\,\omega + (1 - p)\,\varphi.$$

Clearly, if we apply a measurement on $\omega'$, we get
outcome $E$ with probability

$$E(\omega') = p\,E(\omega) + (1 - p)\,E(\varphi).$$

Thus, by this construction, we see that states $\omega$
become elements of an affine space, and effects $E$ are
affine maps. The set of all possible states - called the
state space $S$ - will be a subset of this affine space. We
have just seen that $\omega \in S$ and $\varphi \in S$ imply
$p\,\omega + (1 - p)\,\varphi \in S$ if $0 \le p \le 1$; that
is, state spaces are convex sets (similar reasoning is
given in [17]).

In principle, state spaces can be infinite-dimensional
(and in fact, in many physical situations, they are).
However, in this paper, we will only consider
finite-dimensional state spaces. Then, states $\omega$ are
determined by finitely many coordinates, and we may use
this to construct a more concrete representation of states.
Denote the dimension of a state space $S$ by $d$. Then, by
choosing $d$ affinely independent effects $E_1, \ldots, E_d$,
the probabilities $E_1(\omega), \ldots, E_d(\omega)$ determine
$\omega$ uniquely. We now use these probabilities as
coordinates and represent the state as the vector

$$\omega = [\omega_0, \omega_1, \ldots, \omega_d]^T := [1, E_1(\omega), \ldots, E_d(\omega)]^T. \qquad (1)$$

The choice of $E_1, \ldots, E_d$ is arbitrary, subject only
to the restriction that they are affinely independent. We
call a set of effects with this property fiducial, and we
refer to $E_1(\omega), \ldots, E_d(\omega)$ as fiducial outcome
probabilities. The component $\omega_0 := 1$ has been
introduced for calculational convenience: it allows us to
write the affine effects $E$ as linear functionals on the
larger space $\mathbb{R}^{d+1}$. It will also turn out to be
particularly useful in calculations involving composite
state spaces.

In the following, we will assume that state spaces $S$
are topologically closed and bounded, i.e. compact (for a
physical motivation see [3]). The extremal points of the
convex set $S$ will be called pure states; these are states
$\omega$ which cannot be written as mixtures
$p\,\varphi + (1 - p)\,\varphi'$ of other states
$\varphi \neq \varphi'$ with $0 < p < 1$. It follows from
the compactness of $S$ that every state can be written as
a convex combination of at most $d+1$ pure states [18].

Measurements with $n$ outcomes are described by a
collection of $n$ effects $E_1, E_2, \ldots, E_n$ with the
property $E_1(\omega) + E_2(\omega) + \ldots + E_n(\omega) = 1$
for all states $\omega$. This expresses the fact that
outcome $i$ happens with probability $E_i(\omega)$, and the
total probability is one. Note that two effects $E$ and $F$
can only be part of the same measurement if
$E(\omega) + F(\omega) \le 1$ for all states $\omega$. Sets of
fiducial effects (as introduced above) do not necessarily
have this property. A single effect $E$ is always part of a
measurement with two outcomes $E$ and $\bar{E}$, where
$\bar{E}(\omega) := 1 - E(\omega)$ (we will use this
bar-notation frequently).

Figure 2 gives some examples of convex state spaces.
First, consider a classical bit, which is described within
CPT. We can think of a coin which shows either heads
or tails; in general, it can be in one of those
configurations with some probability. The probability $p$
of showing heads determines the state uniquely, since the
tails probability must be $1 - p$. Thus, $p \in [0, 1]$ is a
fiducial probability; recalling (1), we can represent
states as $\omega = [1, p]^T$. This yields a one-dimensional
state space, with two pure states $[1, 0]^T$ and $[1, 1]^T$,
corresponding to coins which deterministically show tails
or heads, respectively. It is depicted in Figure 2a).
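
As a minimal sketch of this representation (not part of the original text), the following Python fragment encodes classical-bit states as vectors $[1, p]^T$ and effects as coefficient lists. The names `mix`, `apply_effect`, `E_HEADS` and `E_TAILS` are illustrative assumptions; the point is only that, with the constant first component, effects act linearly and respect mixtures.

```python
from fractions import Fraction

def mix(p, omega, phi):
    """Return the convex mixture p*omega + (1-p)*phi, component by component."""
    return [p * a + (1 - p) * b for a, b in zip(omega, phi)]

def apply_effect(effect, omega):
    """Evaluate a linear functional, given by its coefficients, on a state vector."""
    return sum(e * w for e, w in zip(effect, omega))

# Classical bit in the representation [1, p], with p the probability of heads.
HEADS = [Fraction(1), Fraction(1)]
TAILS = [Fraction(1), Fraction(0)]

# Thanks to the constant first component (equal to 1), both effects are linear.
E_HEADS = [0, 1]    # omega -> p
E_TAILS = [1, -1]   # omega -> 1 - p, the complementary ("bar") effect

p = Fraction(1, 3)
omega = mix(p, HEADS, TAILS)   # coin prepared as heads with probability 1/3
```

Exact fractions are used so that the affine relation between mixtures and outcome probabilities can be checked without rounding.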

Similarly, classical $n$-level systems have states which
correspond to probability distributions $p_1, \ldots, p_n$.
Since $p_n = 1 - (p_1 + \cdots + p_{n-1})$, the numbers
$p_1, \ldots, p_{n-1}$ are fiducial outcome probabilities,
yielding states $\omega = [1, p_1, \ldots, p_{n-1}]^T$.
Geometrically, the resulting state spaces are simplices.
They are depicted in Figure 2b) and c) for $n = 3$ and
$n = 4$.

FIG. 2: Examples of convex state spaces: a) is a classical
bit, b) and c) are classical 3- and 4-level systems, d) is a
quantum bit, e) is the projection of a qubit, f) and g) are
neither classical nor quantum. Note that quantum $n$-level
systems for $n \ge 3$ are not balls.

Quantum systems look very different: as is well known,
states of quantum 2-level systems, i.e. qubits, can be
parametrized by a vector $\vec r \in \mathbb{R}^3$ with
$|\vec r| \le 1$, such that every density matrix can be
written $\rho = (\mathbf{1} + \vec r \cdot \vec\sigma)/2$, with
$\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ the Pauli
matrices. Thus, we can use the vector
$[1, r'_x, r'_y, r'_z]^T$ to represent states, where
$r'_i := (1 + r_i)/2$ is the probability to measure
"spin up" in $i$-direction. This state space is the famous
(slightly reparametrized) Bloch ball, cf. Figure 2d).
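
The relation between the density matrix and the fiducial "spin up" probabilities can be checked numerically. The sketch below is an illustration under our own naming (`density_matrix`, `spin_up_projector`, `fiducial_vector` are hypothetical helpers, not from the paper); it assumes the standard Pauli matrices and computes each probability as the trace of the spin-up projector $(\mathbf{1} + \sigma_i)/2$ times the density matrix.

```python
# 2x2 matrices are stored as nested lists; Python's built-in complex numbers suffice.
IDENTITY = [[1, 0], [0, 1]]
PAULI = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def matmul(a, b):
    """Product of two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def density_matrix(r):
    """rho = (1 + r . sigma) / 2 for a Bloch vector r = (r_x, r_y, r_z)."""
    return [[(IDENTITY[i][j] + sum(r[k] * PAULI[k][i][j] for k in range(3))) / 2
             for j in range(2)] for i in range(2)]

def spin_up_projector(k):
    """Projector (1 + sigma_k) / 2 onto 'spin up' in direction k."""
    return [[(IDENTITY[i][j] + PAULI[k][i][j]) / 2 for j in range(2)]
            for i in range(2)]

def fiducial_vector(r):
    """State vector [1, r'_x, r'_y, r'_z] with r'_k = Tr(P_k rho)."""
    rho = density_matrix(r)
    probs = []
    for k in range(3):
        m = matmul(spin_up_projector(k), rho)
        probs.append((m[0][0] + m[1][1]).real)
    return [1.0] + probs

omega = fiducial_vector((0.6, 0.0, 0.8))   # a pure state on the Bloch sphere
```

For this Bloch vector the three probabilities come out as $(1 + r_i)/2$, in agreement with the reparametrization stated in the text.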

Figure 2e) shows a state space which is a projection
of the Bloch ball: it corresponds to the effective state
space that we obtain if, for some reason, spin measurements
in $z$-direction are physically impossible to implement,
with states $\omega = [1, r'_x, r'_y]^T$. The square state
space in Figure 2f) describes a system for which there
exist two independent effects, say $X$ and $Y$, that can
yield probabilities $X(\omega)$ and $Y(\omega)$ in $[0, 1]$
arbitrarily and independently from each other. States will
be of the form $\omega = [1, \omega_x, \omega_y]^T$, with
$\omega_x = X(\omega)$ and $\omega_y = Y(\omega)$.

Consider the two yes-no-measurements which correspond
to the effects $X$ and $Y$; we can interpret these as spin
measurements in two orthogonal directions, with
"yes"-outcome $X$ or $Y$ for "spin up", and "no"-outcome
$\bar{X}$ or $\bar{Y}$ for "spin down". If we perform either
one of these measurements on the state
$\omega = [1, 1, 1]^T$ in the square, then we will get the
"yes"-outcome with unit probability - and this is true for
both measurements. If we consider the analogous
measurements on the circle state space, we see that the
corresponding behaviour becomes impossible: if one of the
spin measurements yields outcome "yes" with certainty, then
the other spin measurement must give outcome "yes" with
probability 1/2. This follows from $r_x^2 + r_y^2 \le 1$:
certainty for the first measurement means $r_x = 1$, which
forces $r_y = 0$.
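
The contrast between the square and the circle can be made concrete with a small scan. This is only a sketch: the membership tests `in_square` and `in_circle` and the grid-based helper `allowed_py_given_px` are our own illustrative constructions, and the circle is taken to be the disc of probability pairs whose Bloch coordinates $r = 2p - 1$ satisfy the norm constraint.

```python
def in_square(px, py):
    """Square state space: both 'yes' probabilities range freely over [0, 1]."""
    return 0 <= px <= 1 and 0 <= py <= 1

def in_circle(px, py, tol=1e-12):
    """Circle state space: r_x^2 + r_y^2 <= 1 with r = 2p - 1."""
    rx, ry = 2 * px - 1, 2 * py - 1
    return rx * rx + ry * ry <= 1 + tol

def allowed_py_given_px(px, in_state_space, steps=1000):
    """Grid scan of the values of Y(omega) compatible with a fixed X(omega) = px."""
    return [k / steps for k in range(steps + 1) if in_state_space(px, k / steps)]

# Fix the first measurement to give "yes" with certainty.
square_values = allowed_py_given_px(1.0, in_square)
circle_values = allowed_py_given_px(1.0, in_circle)
```

On the square every value of the second probability remains possible, whereas on the circle the scan leaves only the single value 1/2, which is the complementarity described above.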

Thus, the circle state space shows a form of
complementarity, which is not present in the square state
space. As this example illustrates, the state space of a
physical system can tell us everything about its
information-theoretic properties. Given a description of
the state space $S$, we can also determine the set of all
linear functionals which map states to the unit interval
$[0, 1]$, that is, the candidates for possible effects.
However, not all of them may be possible to implement in
physics: maybe some of them are "forbidden", similarly as
superselection rules forbid some superpositions in quantum
mechanics. Therefore, to every given state space $S_A$,
there is a set of "allowed effects" which are interpreted
as those that can actually be physically performed.

We introduce some notions which will be useful later:
A set of states $\omega_1, \ldots, \omega_n$ is called
distinguishable if there is a measurement with outcomes
represented by effects $E_1, \ldots, E_n$ such that
$E_i(\omega_j) = \delta_{ij}$, which is 1 if $i = j$ and 0
otherwise. The interpretation is that we can build a device
which perfectly distinguishes the different states
$\omega_j$. Given a physical system $A$, we define the
capacity $N_A$ as the maximal size of any set of
distinguishable states $\omega_1, \ldots, \omega_n \in S_A$.
A measurement which is able to distinguish $N_A$ states
(that is, as many as possible) will be called complete.

For a quantum state space, $N_A$ equals the dimension of
the underlying complex Hilbert space. Following Wootters
and Hardy [4, 21], we also use the notation
$K_A := \dim(S_A) + 1$; this is the dimension of the
surrounding linear space that carries $S_A$. For a qubit,
for example, we have $N_A = 2$, but $K_A = 4$. In quantum
theory, $K_A = N_A^2$ equals the number of independent real
parameters in a density matrix (dropping normalization).
In classical probability theory, we always have
$K_A = N_A$.



B. Transformations 

A transformation is a map $T$ which takes a state to
another state. Which transformations are actually possible
is a question of physics. However, there are certain
minimal assumptions that every transformation must
necessarily satisfy in order to be physically meaningful
in the context of convex state spaces. First,
transformations must respect probabilistic mixtures - that
is,

$$T(p\,\omega + (1 - p)\,\varphi) = p\,T(\omega) + (1 - p)\,T(\varphi).$$

This is because both sides of the equation can be
interpreted as the result of randomly preparing $\omega$ or
$\varphi$ (with probabilities $p$ and $1 - p$, respectively)
and applying the transformation $T$. Thus, transformations
(from one system to itself) are linear maps which map a
state space $S$ into itself.

If both $T$ and $T^{-1}$ are physically allowed
transformations, we call $T$ reversible. The set of
reversible transformations on a state space $S_A$ is a
group $\mathcal{G}_A$. For physical reasons, we assume that
$\mathcal{G}_A$ is topologically closed, hence a compact
group [22] (it may be a finite group).

Reversible transformations map a state space bijectively
onto itself - hence they are symmetries of the state space.
For example, in quantum theory, reversible transformations
are the unitary conjugations,
$\rho \mapsto U \rho U^\dagger$. In the Bloch ball
representation of the qubit (as in Figure 2d)), these maps
are represented as rotations, such that the group of
reversible transformations is isomorphic to SO(3).



However, as this example also shows, not all sym- 
metries are automatically allowed reversible transforma- 
tions: a reflection in the Bloch ball is a symmetry, but it 
is not an allowed transformation (in the density matrix 
picture, it would correspond to an anti-unitary map). 
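
A short numerical illustration of this point: for a unitary $U$, the induced map on Bloch vectors can be read off from the matrix with entries $\frac{1}{2}\mathrm{Tr}(\sigma_i U \sigma_j U^\dagger)$, which is a standard identity rather than something specific to this paper. The helper names (`bloch_rotation`, `det3`) and the particular choice of $U$ below are assumptions made for the example. The resulting matrix has determinant $+1$, whereas the reflection $r_y \mapsto -r_y$ has determinant $-1$ and therefore cannot arise in this way.

```python
import cmath
import math

PAULI = [
    [[0, 1], [1, 0]],        # sigma_x
    [[0, -1j], [1j, 0]],     # sigma_y
    [[1, 0], [0, -1]],       # sigma_z
]

def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(a):
    """Conjugate transpose."""
    n = len(a)
    return [[complex(a[j][i]).conjugate() for j in range(n)] for i in range(n)]

def bloch_rotation(u):
    """3x3 matrix R with R[i][j] = (1/2) Tr(sigma_i U sigma_j U^dagger)."""
    ud = dagger(u)
    rot = []
    for i in range(3):
        row = []
        for j in range(3):
            m = matmul(PAULI[i], matmul(u, matmul(PAULI[j], ud)))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        rot.append(row)
    return rot

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

theta = 0.7
# Example unitary: diag(e^{-i theta/2}, e^{+i theta/2}), a rotation about the z-axis.
U = [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]
R = bloch_rotation(U)

# Reflection r_y -> -r_y: a symmetry of the ball that is not a rotation.
REFLECTION = [[1, 0, 0], [0, -1, 0], [0, 0, 1]]
```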

In summary, for what follows, a physical system $A$ is
specified by three mathematical objects: the state space
$S_A$, the group of reversible transformations
$\mathcal{G}_A$ (which is a compact subgroup of all
symmetries of $S_A$), and a set of physically allowed
effects. The latter will not be given a particular
notation, but we assume that the set of allowed effects is
topologically closed. For obvious physical reasons, if $E$
is an allowed effect and $T \in \mathcal{G}_A$, then
$E \circ T$ is an allowed effect; similarly, convex
combinations of allowed effects are allowed.



C. Composite systems 

If we are given two physical systems $A$ and $B$, we
would like to define a composite system $AB$ which is also
a physical system in the sense described above, with its
own state space $S_{AB}$, group of reversible
transformations $\mathcal{G}_{AB}$, and set of allowed
effects.

In contrast to quantum theory, the framework of general
probabilistic theories allows many different possible
composites for two given systems $A$ and $B$. Every possible
composite $AB$ has a set of minimal physical assumptions
that it must satisfy:

• If $\omega_A \in S_A$ and $\omega_B \in S_B$ are two local
states, then there is a distinguished state
$\omega_A \omega_B \in S_{AB}$ which is interpreted as the
result of preparing $\omega_A$ and $\omega_B$ independently
on the subsystems $A$ and $B$.

• If $E_A$ and $E_B$ are local allowed effects on $A$ and
$B$, then there is a distinguished allowed effect
$E_A E_B$ on $AB$ which is interpreted as measuring $E_A$ on
$A$ and $E_B$ on $B$ independently, yielding the total
probability that outcome $E_A$ happens on system $A$, and
outcome $E_B$ happens on system $B$.

• This intuition is mathematically expressed by demanding
that

$$E_A E_B(\omega_A \omega_B) = E_A(\omega_A)\, E_B(\omega_B),$$

where both $E_A E_B$ and $\omega_A \omega_B$ are affine in
both arguments. This also formalizes the physical
assumption that the temporal order of the local
preparations and of the local measurements is irrelevant.

From the previous point, we can infer that we can
represent independent local preparations
$\omega_A \omega_B$ and measurement outcomes $E_A E_B$ by
tensor products:

$$E_A E_B = E_A \otimes E_B, \qquad \omega_A \omega_B = \omega_A \otimes \omega_B.$$

Consider the joint state space $S_{AB}$, which is contained
in a linear space $AB$. We have inferred that

$$A \otimes B \subseteq AB.$$

For the dimensions of these spaces, we obtain

$$K_A K_B \le K_{AB}.$$

Now consider two different measurements (for simplicity
with two outcomes) $E_B, \bar{E}_B := \mathbf{1}_B - E_B$ and
$F_B, \bar{F}_B := \mathbf{1}_B - F_B$, where $\mathbf{1}_B$
denotes the trivial effect on system $B$ which yields unit
probability on every normalized state. We can think of an
agent Bob, holding system $B$, who may decide freely (say,
randomly in a way which is uncorrelated with $A$) whether
to perform measurement $E_B, \bar{E}_B$ or
$F_B, \bar{F}_B$.

Suppose that Alice (holding system $A$) performs some
measurement after Bob has chosen and performed his
measurement on a bipartite state $\omega_{AB}$. The marginal
probability that she obtains (not knowing Bob's outcome)
is the same in both cases:

$$E_A \otimes \mathbf{1}_B(\omega_{AB}) = E_A \otimes E_B(\omega_{AB}) + E_A \otimes \bar{E}_B(\omega_{AB}) = E_A \otimes F_B(\omega_{AB}) + E_A \otimes \bar{F}_B(\omega_{AB}).$$

The same holds with the roles of $A$ and $B$ reversed.
We have recovered the no-signalling property: Bob cannot
send information to Alice merely by choosing his local
measurement (and vice versa). Moreover, we have proven that
Alice locally observes the reduced state
$\omega_A := \mathrm{Id}_A \otimes \mathbf{1}_B(\omega_{AB})$
(note that $\mathrm{Id}_A$ is a linear transformation, while
$\mathbf{1}_B$ is a linear functional). This state is
uniquely characterized by the equation

$$E_A(\omega_A) = E_A \otimes \mathbf{1}_B(\omega_{AB})$$

for all functionals (in particular, all allowed effects)
$E_A$.
For physically meaningful composites $AB$, we should
demand that reduced states $\omega_A$, $\omega_B$ of all
bipartite states $\omega_{AB} \in S_{AB}$ are valid local
states themselves. Instead, we will demand something which
is stronger and contains this as a special case. Suppose
that Alice and Bob share $\omega_{AB}$ and Bob performs a
measurement and obtains outcome $E_B$. Knowing this outcome
leaves a conditional state $\omega_A^{B}$ at Alice's side
(the superscript refers to Bob's outcome $E_B$), which by
elementary probability theory satisfies

$$E_A(\omega_A^{B}) = \frac{E_A \otimes E_B(\omega_{AB})}{\mathbf{1}_A \otimes E_B(\omega_{AB})}. \qquad (2)$$

We demand that $\omega_A^{B} \in S_A$ for all allowed effects
$E_B$ and all $\omega_{AB} \in S_{AB}$. The reduced state
$\omega_A$ can be written

$$\omega_A = \lambda\, \omega_A^{B} + (1 - \lambda)\, \omega_A^{\bar{B}}$$

with $\lambda = \mathbf{1}_A \otimes E_B(\omega_{AB})$, where
$\omega_A^{\bar{B}}$ is the conditional state for the
complementary outcome $\bar{E}_B$; thus, $\omega_A \in S_A$
by convexity.
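
The reduced and conditional states can be illustrated on the simplest composite, two classical bits. In the sketch below, a bipartite state is stored as a matrix of tensor-product coefficients in the representation $[1, p]^T$ on each side; this storage convention and all function names (`product_state`, `conditional_state_A`, `reduced_state_A`, and so on) are assumptions made for the example, not notation from the paper.

```python
from fractions import Fraction as F

UNIT = [1, 0]   # unit effect on a classical bit [1, p]: reads off the first component

def product_state(omega_a, omega_b):
    """Coefficient matrix of the tensor product omega_A (x) omega_B."""
    return [[a * b for b in omega_b] for a in omega_a]

def mix(p, w1, w2):
    """Convex mixture of two bipartite states."""
    return [[p * x + (1 - p) * y for x, y in zip(r1, r2)]
            for r1, r2 in zip(w1, w2)]

def complement(effect):
    """The complementary effect 1 - E."""
    return [u - e for u, e in zip(UNIT, effect)]

def conditional_state_A(w, e_b):
    """Return (lambda, conditional state on A) given Bob's outcome e_b, as in (2)."""
    unnormalized = [sum(w_ij * e_j for w_ij, e_j in zip(row, e_b)) for row in w]
    lam = sum(u * x for u, x in zip(UNIT, unnormalized))   # 1_A (x) E_B (omega_AB)
    return lam, [x / lam for x in unnormalized]

def reduced_state_A(w):
    """Reduced state (Id_A (x) 1_B)(omega_AB)."""
    return [sum(w_ij * u for w_ij, u in zip(row, UNIT)) for row in w]

HEADS, TAILS = [F(1), F(1)], [F(1), F(0)]
# Perfectly correlated coins: both heads or both tails, each with probability 1/2.
w = mix(F(1, 2), product_state(HEADS, HEADS), product_state(TAILS, TAILS))

E_B = [0, 1]                                   # Bob sees heads
lam, cond = conditional_state_A(w, E_B)
lam_bar, cond_bar = conditional_state_A(w, complement(E_B))
omega_A = reduced_state_A(w)
```

In this example Alice's conditional state is heads or tails depending on Bob's outcome, and the weighted average of the two conditional states reproduces her reduced state, as in the convex decomposition above.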

In some situations, this condition is automatically
satisfied, namely if all effects on $A$ and $B$ are allowed
(recall that not all effects need to be physically possible
to implement; above, we have discussed that possibly only a
subset of effects might be physically allowed). The proof
will also illustrate that the cone of unnormalized states
is a useful concept.



Lemma 1. Suppose that A and B are state spaces such 
that all effects are allowed. Then, the inclusion of condi- 
tional states in the local state spaces follows directly from 
the fact that the composite state space AB contains all 
product states and effects. 

Proof. Define the cone of unnormalized states $A_+$ on $A$
by

$$A_+ := \{\lambda\, \omega_A \mid \omega_A \in S_A,\ \lambda \ge 0\}.$$

Since $\mathbf{1}_A(\lambda \omega) = \lambda$ for
$\omega \in S_A$, a vector $\omega \in A_+$ is a normalized
state, i.e. $\omega \in S_A$, if and only if
$\mathbf{1}_A(\omega) = 1$. The cone of unnormalized effects
is

$$A^+ := \{\lambda\, E_A \mid \lambda \ge 0,\ E_A(\omega_A) \in [0, 1] \text{ for all } \omega_A \in S_A\}.$$

Since we have said that all effects are allowed, every
linear map $E_A : A \to \mathbb{R}$ with
$E_A(\omega) \in [0, 1]$ for all $\omega \in S_A$ is an
allowed effect. The set $A^+$ contains all non-negative
multiples of those. Both sets $A_+$ and $A^+$ are closed
convex cones [23], where "cones" refers to the fact that if
$x$ is in the set, then $\lambda x$ is also in the set for
all $\lambda \ge 0$.

It is now easy to see that $A^+$ is the "dual cone"
$(A_+)^*$ of $A_+$, where

$$(A_+)^* := \{E : A \to \mathbb{R} \text{ linear} \mid E(\omega) \ge 0 \text{ for all } \omega \in A_+\}.$$

Since $(A_+)^{**} = A_+$, we get also that $A_+$ is the dual
cone of $A^+$; in other words,

$$A_+ = \{\omega \in A \mid E(\omega) \ge 0 \text{ for all } E \in A^+\}.$$

Recall the definition of the conditional state in (2). It
follows directly from this definition that
$E_A(\omega_A^{B}) \ge 0$ for all allowed effects $E_A$,
hence for all $E_A \in A^+$. But then, we must have
$\omega_A^{B} \in A_+$. Since
$\mathbf{1}_A(\omega_A^{B}) = 1$, we get
$\omega_A^{B} \in S_A$. The same reasoning holds for $B$
instead of $A$. □

Our state spaces also carry a group of reversible
transformations. If $G_A \in \mathcal{G}_A$ is a reversible
transformation on $A$, and $G_B \in \mathcal{G}_B$ one on
$B$, it is physically clear that we should be able to
accomplish both transformations locally independently;
i.e., $G_A \otimes G_B \in \mathcal{G}_{AB}$. We will assume
that composite state spaces satisfy this condition.

One of our postulates below will be the postulate of 
local tomography. This is an additional condition on 
composites AB which is sometimes, but not always im- 
posed in the framework of general probabilistic theories: 
It states that 

global states are uniquely determined by the statistics of 
local measurement outcomes. 

That is, if $\omega_{AB}$ and $\varphi_{AB}$ are global
states in $S_{AB}$, then
$E_A \otimes E_B(\omega_{AB}) = E_A \otimes E_B(\varphi_{AB})$
for all local effects $E_A$ and $E_B$ implies that
$\omega_{AB} = \varphi_{AB}$. But the part of $AB$ which is
"seen" by product effects $E_A \otimes E_B$ is exactly
$A \otimes B$. That is, the postulate of local tomography
is equivalent to $AB = A \otimes B$, and thus to

$$K_{AB} = K_A K_B.$$



Thus, we get some kind of "tensor product rule" for
composite state spaces, including
$\mathbf{1}_{AB} = \mathbf{1}_A \otimes \mathbf{1}_B$. Note
that this is not as strong as the tensor product rule of
quantum theory (which specifies the global states uniquely,
given the local Hilbert spaces). Classical probability
theory satisfies this rule as well. Suppose that $A$ is a
classical bit, and $B$ is a classical 3-level system. Then
the composite $AB$ is a classical 6-level system, i.e.
$K_{AB} = 6$, while $K_A = 2$ and $K_B = 3$. We get
$K_{AB} = K_A K_B$, which is equivalent to local
tomography.
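
The dimension count behind this rule is easy to verify for the two theories discussed here. The following sketch counts real parameters directly; it assumes (as is the case in both classical probability theory and quantum theory) that capacities multiply under composition, and the function names are hypothetical.

```python
def quantum_K(n):
    """Real parameters of an n x n Hermitian matrix (unnormalized density matrix)."""
    diagonal = n                  # real diagonal entries
    off_diagonal = n * (n - 1)    # real and imaginary parts above the diagonal
    return diagonal + off_diagonal

def classical_K(n):
    """Parameters of an unnormalized probability distribution on n outcomes."""
    return n

def satisfies_product_rule(K, n_a, n_b):
    """Check K_AB = K_A * K_B, assuming the capacity of AB is n_a * n_b."""
    return K(n_a * n_b) == K(n_a) * K(n_b)
```

For instance, `classical_K(2) * classical_K(3)` reproduces the value 6 from the example above, and `quantum_K(2)` gives the four parameters of a qubit.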

To see that we are still far beyond quantum theory, suppose
that $A$ and $B$ are both the square state space from Figure 2f).
Then, define the global state space $\mathcal{S}_{AB}$ as the set
of all vectors $x \in AB$ with $E_A \otimes E_B(x) \in [0, 1]$ for
all effects $E_A$ and $E_B$, and
$\mathbf{1}_A \otimes \mathbf{1}_B(x) = 1$ (normalization). It
turns out that this state space contains so-called PR-box states
that violate the Bell-CHSH inequality by more than any quantum
states [15]. The set of states $\mathcal{S}_{AB}$ itself turns
out to be the eight-dimensional no-signalling polytope for two
parties with two measurements and two outcomes each. The fact
that these state spaces can have stronger non-locality than
quantum theory has been extensively studied [14, 15, 25-29] and
is a main reason for the popularity of general probabilistic
theories in quantum information.
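
To make the PR-box statement concrete, here is a small sketch. It assumes the standard textbook form of the PR box, written directly as a table of conditional probabilities $p(a, b \mid x, y)$ rather than as a vector in the tensor product of two squares; the function names are illustrative. It evaluates the CHSH expression and checks the no-signalling property.

```python
from itertools import product

def pr_box(a, b, x, y):
    """p(a, b | x, y) of a PR box: outcomes obey a XOR b = x AND y."""
    return 0.5 if (a ^ b) == (x & y) else 0.0

def correlator(p, x, y):
    """Expectation value of (-1)^(a XOR b) for measurement settings x, y."""
    return sum((-1) ** (a ^ b) * p(a, b, x, y)
               for a, b in product((0, 1), repeat=2))

def chsh(p):
    """CHSH combination of the four correlators."""
    return (correlator(p, 0, 0) + correlator(p, 0, 1)
            + correlator(p, 1, 0) - correlator(p, 1, 1))

def is_no_signalling(p, tol=1e-12):
    """Each party's marginal must not depend on the other party's setting."""
    for out, own in product((0, 1), repeat=2):
        alice = [sum(p(out, b, own, y) for b in (0, 1)) for y in (0, 1)]
        bob = [sum(p(a, out, x, own) for a in (0, 1)) for x in (0, 1)]
        if abs(alice[0] - alice[1]) > tol or abs(bob[0] - bob[1]) > tol:
            return False
    return True

chsh_value = chsh(pr_box)
quantum_maximum = 2 * 2 ** 0.5  # Tsirelson bound for quantum states
```

The value `chsh_value` equals 4, above both the classical bound 2 and the quantum maximum $2\sqrt{2}$, while `is_no_signalling(pr_box)` holds.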

It is also important to keep in mind that the conditions
above do not determine the composite state space
$\mathcal{S}_{AB}$ uniquely, even if $\mathcal{S}_A$ and
$\mathcal{S}_B$ are given. For example, if $\mathcal{S}_A$ and
$\mathcal{S}_B$ are quantum state spaces, then the usual
composite quantum state space is a possible composite
$\mathcal{S}_{AB}$, but there are other possibilities: one of
them is to define $\mathcal{S}_{AB}$ as the set of unentangled
global states. It satisfies all conditions mentioned above.



D. Equivalent state spaces 

In classical physics, choosing a different inertial coor- 
dinate system does not alter the physical predictions of 
Newtonian mechanics. A similar statement is true for
convex state spaces.

Consider a system $A$, given by a state space $\mathcal{S}_A$, a
group of transformations $\mathcal{G}_A$, and some allowed
effects. Suppose that $B$ is another system, and suppose that
there is an invertible linear map $L : A \to B$ (where now $A$
and $B$ denote the linear spaces carrying the state spaces)
such that

• $\mathcal{S}_B = L(\mathcal{S}_A)$,

• $E_A$ is an allowed effect on $A$ if and only if
$E_A \circ L^{-1}$ is an allowed effect on $B$,

• $\mathcal{G}_B = L \circ \mathcal{G}_A \circ L^{-1}$.

Then the systems A and B are physically indistinguish- 
able from each other - they describe the same type of sys- 
tem, just parametrized in different ways. We will then 
call A and B equivalent. This notion is obviously an 
equivalence relation. 



An example of two equivalent state spaces is given by
a qubit $B$ and the three-ball $A$. That is, the set of states
$\mathcal{S}_B$ is the set of $2 \times 2$ density matrices, with
the unitaries (acting by conjugation) as the group of reversible
transformations $\mathcal{G}_B$. The state space $A$ is defined
as the set of states $\omega = [1, \vec r]^T$, where $\vec r$ is
a vector with Euclidean norm $|\vec r| \le 1$ (as in Figure 2d));
the group of transformations is $\mathcal{G}_A = SO(3)$. The
corresponding linear map establishing the equivalence is
$L(\omega) := (r_0 \cdot \mathbf{1} + \vec r \cdot \vec\sigma)/2$,
where $r_0$ denotes the first component of $\omega$.
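
The equivalence between the three-ball and the qubit can be illustrated numerically. The sketch below is an illustration under assumptions, not the construction used in the proof: the state `omega`, the angle `theta`, and all helper names are hypothetical, and only one particular unitary (a phase rotation about the z-axis) is tested. It checks that conjugation by this unitary on the qubit side corresponds to a rotation of the ball.

```python
import cmath
import math

# Pauli matrices sigma_x, sigma_y, sigma_z as nested lists.
PAULI = (
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
)

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]

def trace(m):
    return m[0][0] + m[1][1]

def L(omega):
    """Map omega = (r0, r1, r2, r3) to the matrix (r0*1 + r.sigma)/2."""
    r0, r = omega[0], omega[1:]
    return [[(r0 * (i == j) + sum(r[k] * PAULI[k][i][j] for k in range(3))) / 2
             for j in range(2)] for i in range(2)]

def L_inverse(rho):
    """Recover (r0, r1, r2, r3) via r0 = Tr(rho), r_k = Tr(rho sigma_k)."""
    comps = [trace(rho)] + [trace(matmul(rho, s)) for s in PAULI]
    return [complex(c).real for c in comps]

theta = 0.7                    # arbitrary rotation angle
omega = [1.0, 0.3, -0.4, 0.5]  # hypothetical state inside the unit ball
U = [[cmath.exp(-1j * theta / 2), 0], [0, cmath.exp(1j * theta / 2)]]

# Unitary conjugation on the qubit side ...
rho_rotated = matmul(matmul(U, L(omega)), dagger(U))
# ... corresponds to a rotation about the z-axis on the ball side.
c, s = math.cos(theta), math.sin(theta)
omega_rotated = [omega[0], c * omega[1] - s * omega[2],
                 s * omega[1] + c * omega[2], omega[3]]
```

Applying `L_inverse` to `rho_rotated` returns `omega_rotated` up to rounding, which is the relation $\mathcal{G}_B = L \circ \mathcal{G}_A \circ L^{-1}$ evaluated on a single group element.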

Thus, in our endeavour to derive quantum theory, all 
we have to do is to prove that all state spaces satisfying 
our postulates are equivalent to quantum state spaces. 



IV. THE POSTULATES 

In this section, we describe our postulates and explain 
their physical meaning. We start with an axiom on com- 
posite state spaces that has already been mentioned in 
Subsection III C above: 

Postulate 1 (Local tomography). The state of a com- 
posite system AB is completely characterized by the 
statistics of measurements on the subsystems A,B. 

The name "local tomography" comes from the interpretation
that state tomography on composite systems can be done by
performing local measurements and subsequently comparing the
outcomes to uncover correlations. As already mentioned, this
postulate is equivalent to $K_{AB} = K_A K_B$, where $K_A$
denotes the number of degrees of freedom needed to specify an
unnormalized state on $A$.

Our second postulate formalizes a property of physics 
that physicists intuitively take for granted, and that is 
in fact used very often in performing real experiments. 
Imagine some physical three-level system (that is, with
three perfectly distinguishable states and no more: $N = 3$)
that we can access in the lab (it might be quantum,
classical, or describable within another theory). Now 
suppose that, for some reason, we have a situation where 
we never find the system in the third of the three distin- 
guishable configurations on performing a measurement. 

To have a concrete example, consider a quantum sys- 
tem that consists of three energy levels which can be 
occupied by a single particle. Suppose the system is con- 
structed such that the third energy level is actually never 
occupied (maybe because the corresponding energy is too 
high) . 

The consequence that we expect is the following: We
effectively have a two-level system. This is definitely true
for quantum theory and classical probability theory, but
it is not necessarily true for other generalized probabilistic
theories. In general, for any number of levels (perfectly
distinguishable states) $N$, we expect to have a corresponding
state space $\mathcal{S}_N$. And the collection of states
$\omega \in \mathcal{S}_N$ which have probability zero to be
found in the $N$-th level upon measurement should be equivalent
to $\mathcal{S}_{N-1}$.

In actual physics, this property is used all the time: 
We apply "effective descriptions" of physical systems, by 
ignoring impossible configurations. Qubits manufactured
in the lab usually correspond to two levels of a
system with many more energy levels, set up in a way
such that the additional energy levels have probability 
close to zero to be occupied. 

One may argue that physics would be in severe trou- 
ble if this property did not hold: we would then pos- 
sibly have to take into account unobservable potential 
configurations even if they are never seen. They would 
modify the resulting state space that we actually observe. 
The following "subspace postulate", first introduced by
Hardy [4], formalizes this idea. It is actually somewhat
stronger than our discussion motivates: it also implies
that, for every $N$, there is a unique type of $N$-level
system $\mathcal{S}_N$.

The notions of complete measurements and equivalent
state spaces are defined in Subsections III A and III D.

Postulate 2 (Equivalence of subspaces). Let $\mathcal{S}_N$ and
$\mathcal{S}_{N-1}$ be systems with capacities $N$ and $N - 1$,
respectively. If $E_1, \ldots, E_N$ is a complete measurement on
$\mathcal{S}_N$, then the set of states
$\omega \in \mathcal{S}_N$ with $E_N(\omega) = 0$ is equivalent
to $\mathcal{S}_{N-1}$.

The notion of equivalence needs some discussion. Postulate 2
states the equivalence of $\mathcal{S}_{N-1}$ and

$$\mathcal{S}'_{N-1} := \{\omega \in \mathcal{S}_N \mid E_N(\omega) = 0\}. \tag{3}$$



Denote the real linear space which contains $\mathcal{S}_N$ by
$V_N$; define $V_{N-1}$ analogously, and set
$V'_{N-1} := \mathrm{span}(\mathcal{S}'_{N-1})$.
Equivalence means first of all that there is an invertible
linear map $L : V_{N-1} \to V'_{N-1}$ such that
$L(\mathcal{S}_{N-1}) = \mathcal{S}'_{N-1}$. But it also means
that transformations and measurements on one of them can be
implemented on the other. We now describe in more detail what
this means.

Every effect $E$ on $\mathcal{S}_N$ defines an effect on
$\mathcal{S}'_{N-1}$ by restricting it to the corresponding
linear space, resulting in $E|_{V'_{N-1}}$. Equivalence implies
that the resulting set of effects is in one-to-one
correspondence with the set of effects on $\mathcal{S}_{N-1}$,
as described in Subsection III D.

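The transformations on $\mathcal{S}'_{N-1}$ are defined
analogously. To be more specific, define
$\tilde{\mathcal{G}}_{N-1}$ as the set of transformations in
$\mathcal{G}_N$ that preserve $\mathcal{S}'_{N-1}$ (or,
equivalently, $V'_{N-1}$):

$$\tilde{\mathcal{G}}_{N-1} := \{T \in \mathcal{G}_N \mid T\mathcal{S}'_{N-1} = \mathcal{S}'_{N-1}\}.$$

The set of reversible transformations $\mathcal{G}'_{N-1}$ is
defined as the restriction of all these transformations to
$\mathcal{S}'_{N-1}$ (or rather, as linear maps, to $V'_{N-1}$):

$$\mathcal{G}'_{N-1} := \{T|_{V'_{N-1}} \mid T \in \tilde{\mathcal{G}}_{N-1}\}.$$

Equivalence means that

$$\mathcal{G}'_{N-1} = L \circ \mathcal{G}_{N-1} \circ L^{-1}.$$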



Concretely, if $U \in \mathcal{G}_{N-1}$ is any reversible
transformation on a state space of capacity $N - 1$, then the
transformation $U' := L \circ U \circ L^{-1}$ is a reversible
transformation on $\mathcal{S}'_{N-1}$, i.e.
$U' \in \mathcal{G}'_{N-1}$. As such, it can be written
$U' = T|_{\mathcal{S}'_{N-1}}$ for some reversible
transformation $T \in \mathcal{G}_N$.

It is important to note that we don't have full information
on $T$ - that is, our postulate does not specify $T$
uniquely, given $U$. By definition, $T$ preserves
$\mathcal{S}'_{N-1}$ and therefore the subspace $V'_{N-1}$, but
we do not know how it acts on the complement of that subspace:
it might act as the identity there, or it might have a
non-trivial action. Postulate 2 does not specify this. In
general, there may (and will) be different $T$ which implement
the same $U$ on the subspace.

Using Postulate 2 iteratively, we see that state spaces 
of smaller capacity are included (in the sense described 
above) in those of larger capacity; symbolically, 

$$\mathcal{S}_1 \subset \mathcal{S}_2 \subset \mathcal{S}_3 \subset \ldots$$

Our next postulate describes the idea that any actual
physical theory of probabilities must allow for ample
possibilities of reversible time evolution. In situations where
"no information is lost" (we assume that this applies to
closed systems), these systems $A$ must evolve reversibly, that
is, according to some subgroup of the group of reversible
transformations $\mathcal{G}_A$. Clearly, if this group is
trivial (contains only the identity), physics becomes "frozen":
no reversible time evolution is possible at all.

Postulate 3 proclaims a minimal amount of transfor- 
mational richness for reversible time evolution: as a min- 
imal requirement, it states that the group of reversible 
transformations should at least act transitively on the 
pure states. That is, if we prepare a pure state $\omega$, and
$\varphi$ is another (desired) pure state on the same state
space, then there should be a reversible transformation $T$
which maps $\omega$ to $\varphi$:

Postulate 3 (Symmetry). For every pair of pure states
$\omega, \varphi \in \mathcal{S}_A$, there is a reversible
transformation $T \in \mathcal{G}_A$ such that
$T\omega = \varphi$.

It is easy to see that Postulate 3 is actually true for 
quantum theory: every pure state can be mapped to ev- 
ery other by some unitary. This example also shows that 
Postulate 3 is rather weak: in quantum theory, even tuples
of perfectly distinguishable pure states
$\omega_1, \ldots, \omega_n$ can be mapped to other tuples
$\varphi_1, \ldots, \varphi_n$ by suitable
unitaries. This is a much higher degree of symmetry 
than what is directly demanded by Postulate 3. 

There is one postulate remaining. As we discussed in
Subsection III A, given some state space $\mathcal{S}_A$, not
all effects (i.e. linear functionals on $A$ which are
non-negative on $\mathcal{S}_A$) may be physically allowed. As
with superselection rules, it might be true that some effects
are impossible to implement (an example would be a state
space that allows only noisy measurements, in which no outcome
whatsoever occurs with probability zero).



In order for our axiomatization to work, we need to 
postulate that this strange behaviour does not happen: 
that is, all mathematically well-defined effects correspond
in fact to allowed measurement outcomes. As it turns
out, it is sufficient to postulate this for a 2-level system
$\mathcal{S}_2$ (i.e. a generalized bit) only. In combination
with the other postulates, it then follows for all other state
spaces.

Postulate 4 (All measurements allowed). All effects on
$\mathcal{S}_2$ are outcome probabilities of possible
measurements.

From a mathematical point of view, this postulate 
could also be regarded as a background assumption: 
structurally, it says that the class of considered theories 
is the class of models where the effects are automatically 
taken as the dual of the states. In other words, it means 
that whenever we refer to "measurements" in the other 
postulates, we actually refer to collections of effects with- 
out considering the possibility that additional physical 
conditions might prevent their implementation. 

It is interesting to note that Postulate 4 can be replaced
by a different formulation, which was first suggested in a
paper by G. Chiribella et al. [6]. It refers to "completely
mixed states", which are states that are in the relative
interior of the convex set of states:

Postulate 4' (Ref. [6]). If a state is not completely 
mixed, then there exists at least one state that can be 
perfectly distinguished from it. 



V. HOW QUANTUM THEORY FOLLOWS FROM THE POSTULATES



We are now ready to carry out the reconstruction of
quantum theory (QT) from the postulates. As it turns
out, there will be another solution to Postulates 1.-4.,
which is classical probability theory (CPT). By this we
mean the theory where the states are finite probability
distributions, and the reversible transformations are the
permutations. Figure 2a)-c) shows what classical probability
distributions look like in terms of convex sets: they
are simplices.

Therefore, we will now prove the following theorem: 

Theorem 1 (Main Result). Every general probabilistic theory
satisfying Postulates 1.-4. above is equivalent to one of the
following two theories:

• Classical probability theory (CPT): The state space is the
set of probability distributions,

$$\mathcal{S}_N = \{(p_1, \ldots, p_N) \mid p_i \ge 0, \ \textstyle\sum_i p_i = 1\},$$

and the reversible transformations $\mathcal{G}_N$ are the
permutations on $\{1, \ldots, N\}$.

• Quantum theory (QT): The state space $\mathcal{S}_N$ is the
set of density matrices on $N$-dimensional complex Hilbert
space,

$$\mathcal{S}_N = \{\rho \in \mathbb{C}^{N \times N} \mid \rho \ge 0, \ \mathrm{Tr}\,\rho = 1\},$$

and the group of reversible transformations $\mathcal{G}_N$ is
the projective unitary group, that is, the set of maps
$\rho \mapsto U \rho U^\dagger$ with $U^\dagger U = \mathbf{1}$.

In both cases, all effects must be allowed. Working
out the set of effects (that is, linear functionals on states
yielding values between 0 and 1), one easily recovers the
usual measurements of CPT and QT.

In this paper, we will not give the full reconstruction 
in all details; the full proof can be found in our more 
technical paper [3]. Instead, we will try to give an easily 
accessible summary of the reconstruction, its main ideas, 
and some interesting observations in the course of the 
argument. 

Before starting to do this, let us discuss a simple 
observation regarding Theorem 1. In order to rule out 
CPT - and hence to single out QT uniquely - we can 
tighten Postulate 3 by replacing it with the following 
modification: 

Postulate 3C (Continuous symmetry). For every pair of pure
states $\omega, \varphi \in \mathcal{S}_A$, there is a
continuous family of reversible transformations
$\{G_t\}_{t \in [0,1]}$ such that $G_0 \omega = \omega$ and
$G_1 \omega = \varphi$.

In other words, every pure state can be "continuously 
moved" into every other pure state. A statement like this 
is expected to be true in physical systems with continuous 
reversible time evolution - which is the case that seems 
to be true, to good approximation, in our universe. The 
consequence is: 

The only general probabilistic theory that satisfies 
Postulates 1, 2, 3C, and 4, is quantum theory (QT). 



A. Why bits are balls 

In QT, the state space of a 2-level system (that is, a
generalized bit, or qubit, $\mathcal{S}_2$) is a
three-dimensional ball, the Bloch ball. In CPT, the (classical)
bit instead is a line segment, as shown in Figure 2. In fact,
this is a ball, too: it is a one-dimensional unit ball. However,
quantum $N$-level systems with $N \ge 3$ are not balls: they
contain mixed states in their topological boundary [43].

We will now show that all theories satisfying our postulates
must have Euclidean ball state spaces as generalized bits. The
dimension of this ball will not be determined yet; this will be
done later on.

Our argument proceeds in two steps: first, we show
that the state space $\mathcal{S}_2$ cannot have lines in its
boundary; that is, we exclude the possibility that
$\mathcal{S}_2$ has proper faces as in the left picture of
Figure 3. Using convex geometry language, we prove that
$\mathcal{S}_2$ is strictly convex.

FIG. 3: Like every compact convex set, the bit state space
$\mathcal{S}_2$ contains pure states $\omega_e$ that are exposed
- that is, there is an effect $E_e$ such that $\omega_e$ is the
unique state where this effect attains value 1. Due to
Postulate 2, this proves that $\mathcal{S}_1$ contains a single
state only. Now suppose $\mathcal{S}_2$ had lines in its
boundary, as in the left picture. Then we would analogously
find another effect $E$ that attains value 1 on a non-trivial
face. Consequently, Postulate 2 would tell us that
$\mathcal{S}_1$ contains infinitely many states - a
contradiction. Thus, $\mathcal{S}_2$ must be strictly convex as
in the right picture. Euclidean ballness follows from group
representation theory.



As a second step, we show that the symmetry property,
Postulate 3, enforces $\mathcal{S}_2$ to be a Euclidean ball. The
reason for this comes from group representation theory: 
since the group of transformations acts linearly, there is 
an inner product such that all transformations are or- 
thogonal with respect to it. 

Lemma 2. The state space of the generalized bit $\mathcal{S}_2$
is strictly convex.

Proof. Consider any effect $E$ with $0 \le E(\omega) \le 1$ for
all states $\omega \in \mathcal{S}_2$. Then this effect belongs
to a two-outcome measurement (as defined in Subsection III A),
consisting of the two effects $E$ and $\mathbf{1} - E$. It is
important to understand that the level sets
$\{x \mid E(x) = c\}$ are hyperplanes
of codimension 1, due to linearity of $E$. This is true
for all state spaces S. On the other hand, given some 
hyperplane, we can construct a corresponding effect E 
(with some freedom of offset and scaling) that has this 
hyperplane as its level set. 

Like every compact convex set, $\mathcal{S}_2$ has at least one
pure state $\omega_e$ which is exposed [24] - that is, there is
a hyperplane which touches the convex set only in $\omega_e$
and in no other point. Thus, we can construct an effect $E_e$
such that the corresponding hyperplane is
$\{x \mid E_e(x) = 1\}$, i.e. $E_e(\omega_e) = 1$, and
$\min_{\omega \in \mathcal{S}_2} E_e(\omega) = 0$. But then,
$(E_e, \mathbf{1} - E_e)$ distinguishes two states perfectly,
which is the maximal number for a bit - in other words, this is
a complete measurement.

Now Postulate 2 says that

$$\{\omega \in \mathcal{S}_2 \mid (\mathbf{1} - E_e)(\omega) = 0\} = \{\omega \in \mathcal{S}_2 \mid E_e(\omega) = 1\} = \{\omega_e\} \simeq \mathcal{S}_1.$$



In other words, $\mathcal{S}_1$ is a trivial state space which
contains only a single state. Now suppose that $\mathcal{S}_2$
had lines in its boundary, and therefore non-trivial faces,
as depicted on the left-hand side of Figure 3. Then we
would find a supporting hyperplane that touches
$\mathcal{S}_2$ in infinitely many states. Constructing a
corresponding effect $E$ and repeating the argument from above,
we would analogously argue that $\mathcal{S}_1$ must contain
infinitely many states. This is a contradiction. □
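
The difference between a state space with flat boundary pieces and a strictly convex one can be seen in a toy computation. The sketch below is only an illustration under assumptions: the square with corners $(\pm 1, \pm 1)$ and the unit disk are hypothetical two-dimensional candidates for a bit, the disk is sampled at 360 angles, and the helper `maximizers` is introduced here. It lists the pure states on which the effect $E(x, y) = (1 + x)/2$ attains its maximal value 1.

```python
import math

def maximizers(pure_states, effect, tol=1e-9):
    """Pure states on which the effect attains its maximal value."""
    values = [effect(s) for s in pure_states]
    top = max(values)
    return [s for s, v in zip(pure_states, values) if top - v < tol]

def effect(state):
    """E(x, y) = (1 + x)/2, which takes values in [0, 1] on both sets below."""
    return (1 + state[0]) / 2

# Square state space: the pure states are the four corners.
square = [(1, 1), (1, -1), (-1, 1), (-1, -1)]
# Disk state space: pure states on the unit circle, sampled at 360 angles.
disk = [(math.cos(math.radians(k)), math.sin(math.radians(k)))
        for k in range(360)]

square_face = maximizers(square, effect)  # two corners: a whole edge
disk_face = maximizers(disk, effect)      # a single exposed point
```

For the square, the set $\{E = 1\}$ contains two pure states and hence the full edge between them, which could not be equivalent to the one-point space $\mathcal{S}_1$; for the disk it is a single point.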

Balls do not have lines in their boundary, but there are
many other strictly convex sets - for example, imagine a
droplet-like figure. However, Postulate 3 says that there
is a lot of symmetry in the state space $\mathcal{S}_2$: all
pure states (which we now know means all states in the
topological boundary) are connected by reversible
transformations.

From this, one can prove that 

Lemma 3. The state space $\mathcal{S}_2$ is equivalent to a
Euclidean ball (of some dimension $d_2 := K_2 - 1$).

Recall that we denote the dimension of the set of unnormalized
states in $\mathcal{S}_N$ by $K_N$; therefore, the set of
normalized states has dimension $K_N - 1$. We will not prove
Lemma 3 here, but only sketch where it comes from. An
important notion turns out to be the maximally mixed
state. On any state space $\mathcal{S}_N$, define $\mu_N$ as a
mixture over the group of transformations,

$$\mu_N := \int_{\mathcal{G}_N} G\omega \, dG,$$



where $\omega \in \mathcal{S}_N$ is any pure state. This is an
integral over the invariant measure of the group; see [30, 31]
for details of its definition. It follows from the
connectedness of all pure states (Postulate 3) that $\mu_N$
does not depend on the choice of the pure state $\omega$.
Moreover, $\mu_N$ turns out to be the unique state which is
invariant with respect to all reversible transformations,

$$G\mu_N = \mu_N \quad \text{for all } G \in \mathcal{G}_N.$$



All states $\omega \in \mathcal{S}_N$ span an affine space of
dimension $K_N - 1$. We can now consider $\mu_N$ to be the
origin of that affine space; then, reversible transformations
$G \in \mathcal{G}_N$ act linearly; they preserve the origin.
By group representation theory, there is an inner product on
that space which is invariant with respect to all reversible
transformations. As a consequence, all pure states have the
same norm with respect to this inner product. In the case of
a bit, i.e. $N = 2$, this yields a sphere, containing all pure
states, with the maximally mixed state $\mu_N$ as the center
of the ball.
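
The existence of an invariant inner product can be illustrated by group averaging. The following is a sketch under assumptions: instead of a compact Lie group with its invariant measure, it uses a hypothetical finite group (the cyclic group $C_4$, generated by a 90-degree rotation written in a skewed basis), so that the integral becomes a finite sum, and exact rational arithmetic avoids rounding issues.

```python
from fractions import Fraction as F

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(a):
    return [list(row) for row in zip(*a)]

# A hypothetical non-orthogonal representation of the cyclic group C_4:
# the 90-degree rotation R, written in a skewed basis, g = M R M^{-1}.
R = [[F(0), F(-1)], [F(1), F(0)]]
M = [[F(2), F(1)], [F(0), F(1)]]
M_inv = [[F(1, 2), F(-1, 2)], [F(0), F(1)]]
g = matmul(matmul(M, R), M_inv)

identity = [[F(1), F(0)], [F(0), F(1)]]
group = [identity]
for _ in range(3):
    group.append(matmul(group[-1], g))  # identity, g, g^2, g^3

# Averaged Gram matrix P = (1/|G|) * sum_h h^T h defines <x, y> := x^T P y.
P = [[sum(matmul(transpose(h), h)[i][j] for h in group) / len(group)
      for j in range(2)] for i in range(2)]

def is_invariant(h):
    """True if <h x, h y> = <x, y> for all x, y, i.e. h^T P h = P."""
    return matmul(matmul(transpose(h), P), h) == P
```

Although `g` is not orthogonal with respect to the standard inner product, every group element preserves the averaged inner product defined by `P`, so all points of a group orbit have the same norm in it.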



B. The multiplicativity of capacity 

So far, we know that if we combine two state spaces $A$ and
$B$, the joint state space has dimension $K_{AB} = K_A K_B$ -
this is due to Postulate 1, local tomography, as discussed in
Subsection III C. However, we do not yet know whether the same
equality is true for the capacity $N$. An important step in the
derivation of quantum theory is to prove this. As it turns out,
a key insight is that the maximally mixed state must be
multiplicative: if we have two state spaces $A$ and $B$, then
the maximally mixed state on the composite system $AB$
(assuming our postulates) is

$$\mu_{AB} = \mu_A \otimes \mu_B.$$

This is easily proved from the fact that $\mu_{AB}$ must in
particular be invariant with respect to all local reversible
transformations, leaving $\mu_A \otimes \mu_B$ as the only
possibility. A further key lemma is the following:

Lemma 4. If there are perfectly distinguishable pure states
$\omega_1, \ldots, \omega_n \in \mathcal{S}_N$ that average to
the maximally mixed state, i.e.

$$\mu_N = \frac{1}{n} \sum_{i=1}^{n} \omega_i,$$

then $n = N$.



Proof. Clearly, $N \ge n$, since $N$ is the maximal number
of perfectly distinguishable states. On the other hand,
let $\varphi_1, \ldots, \varphi_N$ be a set of perfectly
distinguishable pure states on $\mathcal{S}_N$, and
$E_1, \ldots, E_N$ the corresponding effects, i.e.
$E_i(\varphi_j) = \delta_{ij}$. Since
$1 = \sum_{i=1}^{N} E_i(\mu_N)$, there must be some $k$ such
that $E_k(\mu_N) \le 1/N$. By Postulate 3, there is a
reversible transformation $G \in \mathcal{G}_N$ with
$G\omega_1 = \varphi_k$. Thus

$$\frac{1}{N} \ge E_k(\mu_N) = E_k \circ G(\mu_N) = \frac{1}{n} \sum_{i=1}^{n} E_k \circ G(\omega_i) \ge \frac{1}{n} E_k \circ G(\omega_1) = \frac{1}{n}.$$

Thus, we also have $N \le n$, proving the claim. □

In quantum theory, the maximally mixed state on an
$N$-dimensional Hilbert space is the density matrix

$$\mu_N = \frac{\mathbf{1}_N}{N} = \frac{1}{N} \sum_{i=1}^{N} |\psi_i\rangle\langle\psi_i|$$

if $|\psi_1\rangle, \ldots, |\psi_N\rangle$ denotes an
orthonormal basis of $\mathbb{C}^N$, that is, if these are pure
states that are perfectly distinguishable. This is in agreement
with Lemma 4. Moreover, we can prove that an analogous formula
holds for every theory satisfying our Postulates 1.-4.:

Lemma 5. For every $N$, there are $N$ pure perfectly
distinguishable states
$\omega_1, \ldots, \omega_N \in \mathcal{S}_N$ such that

$$\mu_N = \frac{1}{N} \sum_{i=1}^{N} \omega_i.$$

We only sketch the proof here: For $N = 1$, the statement is
trivially true, since $\mathcal{S}_1$ contains only a single
state. For $N = 2$, we know that $\mathcal{S}_2$ is a Euclidean
ball, with the maximally mixed state in the center. Thus,
taking $\omega_1$ and $\omega_2$ as two antipodal points on the
ball (say, north and south pole), we get

$$\mu_2 = \frac{1}{2}(\omega_1 + \omega_2),$$

and these states are perfectly distinguishable by an analogue
of a quantum spin measurement. Now suppose we combine $k$ of
these generalized bit state spaces $\mathcal{S}_2$ into a joint
state space,
$\mathcal{S}_2^{\otimes k} := \mathcal{S}_2 \otimes \ldots \otimes \mathcal{S}_2$.
Then the maximally mixed state on the resulting state space is

$$\mu_{\mathcal{S}_2^{\otimes k}} = \mu_2 \otimes \ldots \otimes \mu_2 = \frac{1}{2^k} \sum_{i_1, \ldots, i_k = 1, 2} \omega_{i_1} \otimes \ldots \otimes \omega_{i_k}.$$

Since in locally tomographic composites, products of pure
states are pure, the
$\omega_{i_1} \otimes \ldots \otimes \omega_{i_k}$ are all pure
states, and they are perfectly distinguishable by product
measurements. Thus, Lemma 4 shows that the capacity of
$\mathcal{S}_2^{\otimes k}$ must be
$N_{\mathcal{S}_2^{\otimes k}} = 2^k$. This proves Lemma 5 for
all $N$ which are a power of two. For all other $N$, the lemma
is proved by using the fact that $\mathcal{S}_N$ is embedded in
some $\mathcal{S}_2^{\otimes k}$ for some $k$ large enough due
to Postulate 2, and then constructing the maximally mixed state
on $\mathcal{S}_N$ in a clever way from that on
$\mathcal{S}_2^{\otimes k}$.

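Now we can just tensor together the equations

$$\mu_{N_A} = \frac{1}{N_A} \sum_{i=1}^{N_A} \omega_i^A \quad \text{and} \quad \mu_{N_B} = \frac{1}{N_B} \sum_{j=1}^{N_B} \omega_j^B,$$

and we obtain

$$\mu_{N_{AB}} = \mu_{N_A} \otimes \mu_{N_B} = \frac{1}{N_A N_B} \sum_{i=1}^{N_A} \sum_{j=1}^{N_B} \omega_i^A \otimes \omega_j^B,$$

and Lemma 4 tells us that capacity must be multiplicative:

Lemma 6. $N_{AB} = N_A N_B$.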

Why is this equation so important? As noticed by
Hardy [4], it allows us to draw a surprising conclusion.
Every state space $\mathcal{S}_N$ has unnormalized dimension
$K_N$. Since $K_{AB} = K_A K_B$ and $N_{AB} = N_A N_B$ for all
state spaces $A$ and $B$ due to our postulates, we get the
following facts:

• The function $N \mapsto K_N$ maps natural numbers to
natural numbers, and is strictly increasing due to
Postulate 2.

• It satisfies $K_{N_1 N_2} = K_{N_1} K_{N_2}$, and $K_1 = 1$.

As shown in [4], these simple conditions imply that there
must be an integer $r \ge 1$ such that

$$K_N = N^r. \tag{4}$$



Now recall that the dimension of the bit state space
(which is a Euclidean ball) is $d_2 := K_2 - 1$. It follows
that

$$d_2 \in \{1, 3, 7, 15, 31, \ldots\}$$

since $K_2 = 2^r$ for some $r \in \mathbb{N}$. Thus, we see in
particular that the bit state space is an odd-dimensional
Euclidean ball. The next subsection will deal with the case
$d_2 = 1$; as we will see, this case corresponds to classical
probability theory.
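
As a quick sanity check of Equation (4), the sketch below (the function names are illustrative) verifies for a few exponents $r$ that $K_N = N^r$ is multiplicative, strictly increasing and satisfies $K_1 = 1$, and lists the resulting bit-ball dimensions $d_2 = 2^r - 1$. The exponent $r = 1$ corresponds to CPT with $K_N = N$, and $r = 2$ to QT, where an $N \times N$ Hermitian matrix has $N^2$ real parameters.

```python
def K(N, r):
    """Unnormalized state-space dimension K_N = N**r from Equation (4)."""
    return N ** r

def satisfies_conditions(r, n_max=12):
    """Check K_1 = 1, multiplicativity and strict monotonicity up to n_max."""
    multiplicative = all(K(a * b, r) == K(a, r) * K(b, r)
                         for a in range(1, n_max) for b in range(1, n_max))
    increasing = all(K(n, r) < K(n + 1, r) for n in range(1, n_max))
    return K(1, r) == 1 and multiplicative and increasing

# Possible dimensions d_2 = K_2 - 1 = 2**r - 1 of the bit ball.
bit_ball_dimensions = [K(2, r) - 1 for r in range(1, 6)]
print(bit_ball_dimensions)  # [1, 3, 7, 15, 31]
```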



C. How to get classical probability theory (CPT) 

Suppose that $d_2 = K_2 - 1 = 1$; that is, the generalized
bit is a one-dimensional ball, as shown in Figure 2. A
line segment like this describes a classical bit. What can
we say about $N$-level systems for $N \ge 3$ in this case?
Equation (4) tells us that the parameter $r$ must be $r = 1$,
and thus

$$K_N = N$$

for all $N$, not only for $N = 2$.

Choose $N$ perfectly distinguishable pure states
$\omega_1, \ldots, \omega_N \in \mathcal{S}_N$, and
$E_1, \ldots, E_N$ the corresponding effects with
$E_i(\omega_j) = \delta_{ij}$ as well as
$\sum_i E_i = \mathbf{1}$. It is easy to see that the states
must be linearly independent; since $K_N = N$, they span the
full unnormalized state space.

Thus, every state $\omega$ can be written
$\omega = \sum_{i=1}^{N} \alpha_i \omega_i$, with
$\alpha_i \in \mathbb{R}$ and
$\sum_i \alpha_i = \mathbf{1}(\omega) = 1$. But then,
$E_j(\omega) = \alpha_j \ge 0$, and so this decomposition of
$\omega$ is in fact a convex decomposition.

In other words, the full state space $\mathcal{S}_N$ is the
convex hull of $\omega_1, \ldots, \omega_N$ - that is, a
classical simplex as in Figure 2a)-c). These are exactly the
state spaces of CPT. Moreover, for $N = 2$, we can permute the
two pure states due to Postulate 3. We can therefore use the
subspace postulate, that is Postulate 2, to conclude that every
pair of pure states on $\mathcal{S}_N$ can be interchanged.
These transpositions generate the full permutation group, which
must thus be the group of reversible transformations
$\mathcal{G}_N$. We have therefore proven the following:

In the case $d_2 = 1$, we get classical probability theory as
the unique solution of Postulates 1.-4.



D. The curious 7-dimensional case

Let us now consider the remaining cases, i.e. the cases
where the dimension of the Euclidean bit ball is
$d_2 = K_2 - 1 \in \{3, 7, 15, 31, \ldots\}$. The generalized
bit carries a group of reversible transformations
$\mathcal{G}_2$; by our background assumptions mentioned in
Subsection III B, this must be a topologically closed matrix
group. Closed subgroups of Lie groups are Lie groups;
therefore, $\mathcal{G}_2$ is itself a Lie group. Since it maps
the unit ball into itself, it must be a subgroup of the
orthogonal group.

Denote by $\mathcal{G}_2^0$ the connected component of
$\mathcal{G}_2$ containing the identity matrix. We have

$$\mathcal{G}_2^0 \subseteq SO(d_2).$$

We know from Postulate 3 that for every pair of pure
states $\omega, \varphi \in \mathcal{S}_2$, there is a
reversible transformation $T \in \mathcal{G}_2$ with
$T\omega = \varphi$. In other words, $\mathcal{G}_2$ acts
transitively on the unit sphere, that is, the surface of the
unit ball. It can be shown that this implies that
$\mathcal{G}_2^0$ is itself transitive on the unit sphere.

At first sight, it seems that this enforces $\mathcal{G}_2^0$
to be the full special orthogonal group $SO(d_2)$, but this
intuition is easily seen to be wrong. For example, the group of
$4 \times 4$ matrices

$$\left\{ \begin{pmatrix} \mathrm{Re}\, U & \mathrm{Im}\, U \\ -\mathrm{Im}\, U & \mathrm{Re}\, U \end{pmatrix} \;\middle|\; U \in SU(2) \right\}$$

acts transitively on the surface of the 4-dimensional unit
ball, even though it is a proper subgroup of $SO(4)$. The
set of all compact connected Lie matrix groups which act
transitively on the unit sphere has been classified in
[32-35]. In general, there are many possibilities. Fortunately,
however, we have additional information: we know that
the bit ball has odd dimension $d_2 := K_2 - 1$. It turns out
that there remain only two possibilities:

• If $d_2 \ne 7$, then $\mathcal{G}_2^0 = SO(d_2)$.

• If $d_2 = 7$, then $\mathcal{G}_2^0$ is either $SO(7)$ or of
the form $M G_2 M^{-1}$, where $M$ is a fixed orthogonal
matrix, and $G_2$ is the fundamental representation of the
exceptional Lie group $G_2$.

In fact, $d_2 = 7$ appears in our list of possible dimensions
of the bit ball, because $7 = 2^3 - 1$. In our endeavor to
derive quantum theory from Postulates 1.-4., we will have
to show that all the cases $d_2 \in \{7, 15, 31, \ldots\}$
violate at least one postulate. Thus, we see that the case
$d_2 = 7$ has to be (and is) treated separately.

The appearance of $d_2 = 7$ as a special case seems like
an almost unbelievable coincidence. Is there some deeper
significance to this case? Might there be some interesting
unknown theory waiting to be discovered which has
7-dimensional balls as bits and the exceptional Lie group
$G_2$ as the analogue of local unitaries? We do not know.
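
The $4 \times 4$ matrix group built from $SU(2)$, mentioned above as a transitive proper subgroup of $SO(4)$, can be explored numerically. The sketch below makes some assumptions: it uses the standard parametrization $U = \begin{pmatrix} a & -\bar b \\ b & \bar a \end{pmatrix}$ of $SU(2)$, the target vector is an arbitrary example, and all helper names are introduced here. It constructs a group element that maps $e_1$ to a given unit vector, and exhibits a rotation in $SO(4)$ that lacks the block pattern shared by all group elements.

```python
import math

def su2(a, b):
    """SU(2) matrix [[a, -conj(b)], [b, conj(a)]]; needs |a|^2 + |b|^2 = 1."""
    return [[a, -b.conjugate()], [b, a.conjugate()]]

def real_form(U):
    """The real 4x4 block matrix [[Re U, Im U], [-Im U, Re U]]."""
    re = [[z.real for z in row] for row in U]
    im = [[z.imag for z in row] for row in U]
    top = [re[i] + im[i] for i in range(2)]
    bottom = [[-x for x in im[i]] + re[i] for i in range(2)]
    return top + bottom

def apply(m, v):
    return [sum(m[i][k] * v[k] for k in range(4)) for i in range(4)]

def is_orthogonal(m, tol=1e-12):
    return all(abs(sum(m[k][i] * m[k][j] for k in range(4)) - (i == j)) < tol
               for i in range(4) for j in range(4))

def has_block_form(m, tol=1e-12):
    """Check the pattern [[X, Y], [-Y, X]] shared by all group elements."""
    return all(abs(m[i][j] - m[i + 2][j + 2]) < tol
               and abs(m[i][j + 2] + m[i + 2][j]) < tol
               for i in range(2) for j in range(2))

def group_element_sending_e1_to(v):
    """Group element that maps e1 = (1, 0, 0, 0) to the unit vector v."""
    return real_form(su2(complex(v[0], -v[2]), complex(v[1], -v[3])))

# Hypothetical target point on the unit sphere in R^4.
target = [x / math.sqrt(30) for x in (1.0, 2.0, 3.0, 4.0)]
G = group_element_sending_e1_to(target)
image = apply(G, [1.0, 0.0, 0.0, 0.0])

# A rotation in the (e1, e2) plane lies in SO(4) but not in the subgroup.
c, s = math.cos(0.3), math.sin(0.3)
plane_rotation = [[c, -s, 0, 0], [s, c, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
```

Since every unit vector can be reached from $e_1$ in this way, the action on the sphere is transitive, while `plane_rotation` shows that the group is strictly smaller than $SO(4)$.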



E. Subspace structure and 3-dimensionality 

Having discussed the case of classical probability theory
with bit ball dimension $d_2 = 1$, the remaining cases
are

$$d_2 \in \{3, 7, 15, 31, \ldots\}.$$

We will now show that all dimensions $d_2 \ge 7$ are
incompatible with the postulates, leaving only the case
$d_2 = 3$ - that is, the Bloch ball of quantum theory. For the
rest of this chapter, we ignore the special case $d_2 = 7$ with
$\mathcal{G}_2^0 = M G_2 M^{-1}$ and $G_2$ the exceptional Lie
group; it can be ruled out by an analogous argument.

In the following, we will parametrize the single bit state
space as

$$\mathcal{S}_2 = \left\{ (1, \hat\omega)^T \;\middle|\; \hat\omega \in \mathbb{R}^{d_2}, \ |\hat\omega| \le 1 \right\}.$$

The maximally mixed state becomes $\mu = (1, 0)^T$. Let
$n := (1, 0, \ldots, 0)^T$, then we have two pure states
$\omega_1 := (1, n)^T$ and $\omega_2 := (1, -n)^T$,
corresponding to the north and south pole of the ball. These
states are pure, and they are perfectly distinguished by the
measurement consisting of the two effects (for
$\omega = (1, \hat\omega)^T \in \mathcal{S}_2$)

$$E_1(\omega) := (1 + \langle \hat\omega, n \rangle)/2,$$
$$E_2(\omega) := (1 - \langle \hat\omega, n \rangle)/2.$$

We know that if we combine two bits into a joint state
space, we obtain a state space of capacity four:

$$\mathcal{S}_4 = \mathcal{S}_2 \otimes \mathcal{S}_2.$$

Thus, the product states $\omega_i \otimes \omega_j$ with
$i, j = 1, 2$ represent four perfectly distinguishable states
in $\mathcal{S}_4$, and the corresponding product effects
$E_i \otimes E_j$ constitute a complete measurement. Recall,
however, that the joint state space that we sloppily denoted
$\mathcal{S}_2 \otimes \mathcal{S}_2$ is not fully known so
far - all we know is that the surrounding linear space is
the tensor product of the local spaces. At this stage, we
do not yet have a complete description of the set of all
global states $\mathcal{S}_4$.

Using the subspace postulate twice, i.e. Postulate 2,
we obtain that the set of states $\omega$ with
$(E_1 \otimes E_1 + E_2 \otimes E_2)(\omega) = 1$ is again
equivalent to a single bit. This turns out to be a surprisingly
restrictive requirement that we are now going to exploit.
Denote this set of states by $F$ (it is a face of the state
space $\mathcal{S}_4$), then

$$F = \{\omega \in \mathcal{S}_4 \mid (E_1 \otimes E_1 + E_2 \otimes E_2)(\omega) = 1\} \simeq \mathcal{S}_2.$$

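In the following, we will label the two bits by indices $A$
and $B$ for convenience. The group
$\mathcal{G}_2^0 = SO(d_2)$ contains a subgroup
$\mathcal{G}_2^s$ which leaves the axis containing north and
south pole invariant, i.e.

$$\mathcal{G}_2^s := \{G \in \mathcal{G}_2^0 \mid G\omega_1 = \omega_1 \text{ and } G\omega_2 = \omega_2\} \simeq SO(d_2 - 1).$$

If $R \in SO(d_2 - 1)$, then its action as an element of
$\mathcal{G}_2^s$ is

$$(1, \omega^{(1)}, \ldots, \omega^{(d_2)})^T \mapsto (1, \omega^{(1)}, R(\omega^{(2)}, \ldots, \omega^{(d_2)}))^T.$$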

Suppose we apply one transformation of this kind on
each part of a bipartite state $\omega$ locally; that is, a
transformation $G_A \otimes G_B$ with
$G_A, G_B \in \mathcal{G}_2^s$. Then we have
$(E_1 \otimes E_1 + E_2 \otimes E_2)(\omega) = 1$ if and only if
$(E_1 \otimes E_1 + E_2 \otimes E_2)(G_A \otimes G_B(\omega)) = 1$.
Thus, this transformation leaves the face $F$ invariant:

$$(G_A \otimes G_B) F = F.$$

We know that the dimension of the linear span of $F$ is
$d_2 + 1$, since it is equivalent to $\mathcal{S}_2$. We will
now explore in more detail how the transformations
$G_A \otimes G_B$ act on the face $F$. In particular, we are
interested in the structure of invariant subspaces.

First, consider a single bit. Its unnormalized states
are carried by a real vector space $A = \mathbb{R}^{d_2 + 1}$
that we can decompose in the following way:

$$A = \mathbb{R} \cdot \mu \oplus \mathbb{R} \cdot \omega_1 \oplus A',$$

where $A'$ denotes the set of all vectors with first two
components zero. Since $\mu = (1, 0, \ldots, 0)^T$ and
$G\mu = \mu$, as well as $\omega_1 = (1, 1, 0, \ldots, 0)^T$ and
$G\omega_1 = \omega_1$ for all $G \in \mathcal{G}_2^s$, these
three subspaces are all invariant.

Consequently, the vector space which carries two bits,
$AB = A \otimes B$, contains the subspace $A' \otimes B'$ which
is invariant with respect to all transformations
$G_A \otimes G_B$ for $G_A, G_B \in \mathcal{G}_2^s$. This
defines an action of $SO(d_2 - 1) \times SO(d_2 - 1)$ on the
subspace $A' \otimes B'$.

With a bit of work, one can show that the face $F$ contains
at least one state $\omega$ which has non-zero overlap with
$A' \otimes B'$. Denote the projection of that vector onto this
subspace by $\omega_{A' \otimes B'}$. We know that every
$(G_A \otimes G_B)(\omega)$ is a valid state in the face $F$,
and its component in the aforementioned subspace is
$(G_A \otimes G_B)(\omega_{A' \otimes B'})$. Now imagine we
apply all the local transformations $G_A \otimes G_B$ to the
vector $\omega_{A' \otimes B'}$, and we are interested in the
orbit - that is, in the set of all vectors that we can generate
this way.

If $d_2 \geq 4$, then the group $SO(d_2 - 1)$ has a nice
property in terms of group representation theory [30]: it is
irreducible. That is, its action on $\mathbb{C}^{d_2-1}$ does
not leave any non-trivial subspaces invariant. This allows us
to draw an important conclusion: it implies [30] that the
product group $SO(d_2 - 1) \times SO(d_2 - 1)$ is also
irreducible. But then, the orbit
$(G_A \otimes G_B)(\omega_{A' \otimes B'})$ must span the full
space $A' \otimes B'$, which has dimension $(d_2 - 1)^2$ - this
is a very large orbit.

In fact, it is too large for the subspace postulate: above,
we have concluded from Postulate 2 that the span of the face
$F$ (which is preserved by those local transformations) must
have dimension $d_2 + 1$, which is less than $(d_2 - 1)^2$ if
$d_2 > 3$. Thus, we obtain a contradiction: if the bit ball has
dimension $d_2 \in \{7, 15, 31, \ldots\}$, it is impossible to
combine two bits into a joint state space which satisfies all
our postulates.
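The dimension count behind this contradiction is easy to tabulate. The sketch below only restates the two dimensions quoted above; the helper names are hypothetical.

```python
def span_dim_face(d2):
    """Dimension of the linear span of the face F, as required by Postulate 2."""
    return d2 + 1


def orbit_span_dim(d2):
    """Dimension of A' (x) B', which the orbit spans if SO(d2 - 1) is irreducible."""
    return (d2 - 1) ** 2


def orbit_too_large(d2):
    """True if the orbit would need more dimensions than the span of F offers."""
    return orbit_span_dim(d2) > span_dim_face(d2)


for d2 in (3, 7, 15, 31):
    print(d2, span_dim_face(d2), orbit_span_dim(d2), orbit_too_large(d2))
```

For $d_2 = 3$ the two dimensions coincide, so the count alone gives no contradiction there; for the larger candidate values the orbit does not fit.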

As it turns out, this is not true if $d_2 = 3$: the group
$SO(d_2 - 1) = SO(2)$ leaves the span of $(1, i)^T$ invariant;
that is, $SO(2)$ is reducible. Thus, this case is not ruled out
by the reasoning above. In group-theoretic terms, this
reducibility is related to the fact that $SO(2)$ is Abelian. In
other words, the fact that rotations commute in $3 - 1$
dimensions can be seen as a possible reason for the fact that
the Bloch ball is 3-dimensional.
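The reducibility of $SO(2)$ on $\mathbb{C}^2$ can be checked directly. The following is a minimal sketch with hypothetical helper names; it verifies that every rotation maps $(1, i)^T$ to a multiple of itself, and that two rotations commute.

```python
import math


def rotate(theta, v):
    """Apply the SO(2) rotation by angle theta to a vector v in C^2."""
    c, s = math.cos(theta), math.sin(theta)
    x, y = v
    return (c * x - s * y, s * x + c * y)


def is_multiple_of(w, v, tol=1e-12):
    """Check whether w = lam * v for some complex lam (assumes v[0] != 0)."""
    lam = w[0] / v[0]
    return all(abs(wi - lam * vi) < tol for wi, vi in zip(w, v))


V = (1, 1j)  # the vector (1, i)^T from the text

for theta in (0.3, 1.0, 2.5):
    # The span of (1, i)^T is invariant under every rotation.
    print(theta, is_multiple_of(rotate(theta, V), V))

# A real vector such as (1, 0)^T does not span an invariant line.
print(is_multiple_of(rotate(0.3, (1, 0)), (1, 0)))
```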

Lemma 7. The dimension of the bit ball must be $d_2 = 3$.

We have thus uncovered a group-theoretic explanation why the
smallest non-trivial quantum systems have three mutually
incompatible, independent components and not more. Due to
Postulate 4, we can find all possible measurements on this
state space: all effects (that is, linear functionals) which
yield probabilities in the interval $[0, 1]$ correspond to
outcome probabilities of possible measurements. It is easy to
see that these effects are in one-to-one correspondence with
the quantum measurements (POVMs) on a single qubit.

Furthermore, we know that the group of reversible
transformations contains $SO(3)$, the rotations of the Bloch
ball, which correspond to the unitary transformations on a
qubit. At this point, however, we do not yet know whether
$\mathcal{G}_2 = SO(3)$ or $\mathcal{G}_2 = O(3)$.



F. Quantum theory on $N$-level systems for $N \geq 3$

In the previous section, we have derived quantum theory for
single bits. It remains to show that our postulates also
predict quantum theory for all $N$-level systems with
$N \geq 3$. As before, we only sketch the main proof ideas, and
refer the reader to [3] for proof details.

For a single bit in state $\omega = (1, \hat\omega)^T$, we can
obtain the usual representation as a density matrix by applying
a linear map $L : \mathbb{R}^4 \to \mathbb{C}^{2 \times 2}$,
where the latter symbol denotes the real vector space of
self-adjoint complex $2 \times 2$-matrices. This map $L$ is
defined by linear extension of

$$L(\omega) := (\mathbb{1} + \hat\omega \cdot \vec\sigma)/2,$$

where $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ denotes the
Pauli matrices. The representation that we obtain (applying $L$
in a suitable way to effects and transformations as well) is
equivalent in the sense of Subsection III D to the Bloch ball
representation.
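To make the map $L$ concrete, here is a minimal sketch in plain Python. The function names are hypothetical, and the component order $(1, x, y, z)$ is an assumption that follows the order of the Pauli matrices given above.

```python
import math


def bloch_to_density(omega):
    """Linear extension of L: (n, x, y, z) -> (n*1 + x*sx + y*sy + z*sz) / 2."""
    n, x, y, z = omega
    return [
        [(n + z) / 2, (x - 1j * y) / 2],
        [(x + 1j * y) / 2, (n - z) / 2],
    ]


def eigenvalues(rho):
    """Eigenvalues of a self-adjoint 2x2 matrix from its trace and determinant."""
    tr = (rho[0][0] + rho[1][1]).real
    det = (rho[0][0] * rho[1][1] - rho[0][1] * rho[1][0]).real
    root = math.sqrt(max(tr * tr / 4 - det, 0.0))
    return (tr / 2 - root, tr / 2 + root)


print(eigenvalues(bloch_to_density((1, 0, 0, 0))))  # center of the ball
print(eigenvalues(bloch_to_density((1, 0, 0, 1))))  # point on the surface
print(eigenvalues(bloch_to_density((1, 0, 0, 2))))  # point outside the ball
```

Points inside the unit ball give matrices with non-negative eigenvalues, points on the surface give rank-one projectors, and points outside the ball give a negative eigenvalue, which is why the ball corresponds exactly to the qubit density matrices.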

If we have the state space
$\mathcal{S}_4 = \mathcal{S}_2 \otimes \mathcal{S}_2$ of two
bits, we can use the map $L \otimes L$ to represent states
$\omega \in \mathcal{S}_4$ by self-adjoint $4 \times 4$-matrices
$L \otimes L(\omega)$. Recall that we have constructed a face
$F$ of $\mathcal{S}_4$ in the previous subsection. Analyzing $F$
in a bit more detail, one can show that it contains a family of
pure states $\omega_u$, where $u \in [0, \pi)$, which are mapped
by $L \otimes L$ onto

$$L \otimes L(\omega_u) = |\psi_u\rangle\langle\psi_u|,$$

where

$$|\psi_u\rangle = \cos\frac{u}{2}\,|0\rangle \otimes |0\rangle + \sin\frac{u}{2}\,|1\rangle \otimes |1\rangle$$

for some orthonormal basis $\{|0\rangle, |1\rangle\}$. This is
an entangled quantum state with Schmidt coefficients
$\cos(u/2)$ and $\sin(u/2)$. Choosing $u$ appropriately, they
can attain any value between 0 and 1. Thus, by applying local
unitaries (which correspond to the $SO(3)$-rotations of the
local balls), we can generate all pure quantum states.
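The Schmidt coefficients of this family can be read off from the reduced state of one qubit. The sketch below assumes the form $\cos(u/2)|0\rangle|0\rangle + \sin(u/2)|1\rangle|1\rangle$ with real amplitudes; the function names are hypothetical.

```python
import math


def psi(u):
    """Amplitudes of cos(u/2)|00> + sin(u/2)|11> in the basis |00>, |01>, |10>, |11>."""
    return [math.cos(u / 2), 0.0, 0.0, math.sin(u / 2)]


def reduced_state(amplitudes):
    """Reduced density matrix of the first qubit for real 2-qubit amplitudes."""
    a = [amplitudes[0:2], amplitudes[2:4]]  # a[i][k] = amplitude of |i>|k>
    return [[sum(a[i][k] * a[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]


def schmidt_coefficients(u):
    """Square roots of the reduced state's eigenvalues.
    For this family the reduced state is diagonal, so the diagonal suffices."""
    rho = reduced_state(psi(u))
    return sorted((math.sqrt(rho[0][0]), math.sqrt(rho[1][1])), reverse=True)


for u in (0.0, math.pi / 3, math.pi / 2):
    print(u, schmidt_coefficients(u))
```

At $u = 0$ the state is a product state, while $u = \pi/2$ gives two equal coefficients, i.e. a maximally entangled state.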

Denoting $\mathcal{S}'_4 := L \otimes L(\mathcal{S}_4)$, we have
proven the following:

Lemma 8. $\mathcal{S}'_4$ contains all pure 2-qubit quantum
states as pure states.

The next step is somewhat tricky: we have to show that there
are no further (non-quantum) states in $\mathcal{S}'_4$. The
idea is to show that all quantum effects are allowed effects on
$\mathcal{S}_4$. Then, if there were additional non-quantum
states in $\mathcal{S}_4$, some of these effects would give
negative probabilities, which is impossible.

We know that the product effects are allowed on
$\mathcal{S}_4$. Applying the transformation $L \otimes L$, some
of the corresponding effects on $\mathcal{S}'_4$ are the maps

$$\rho \mapsto \mathrm{Tr}(P_1 \otimes P_2\, \rho),$$

where $P_1$ and $P_2$ are one-dimensional projectors. If
$T \in \mathcal{G}_4$ is any reversible transformation on
$\mathcal{S}_4$, denote the corresponding transformation on
$\mathcal{S}'_4$ by $T' \in \mathcal{G}'_4$. It maps states
$\rho$ to $T'(\rho)$. Suppose we could show the equation

$$\mathrm{Tr}(P_1 \otimes P_2\, T'(\rho)) = \mathrm{Tr}((T')^{-1}(P_1 \otimes P_2)\, \rho). \qquad (5)$$

Then we would be done: due to Postulate 3, transformations
$T' \in \mathcal{G}'_4$ can map every pure product state to
every other pure state, in particular, to every pure entangled
quantum state. This way, $(T')^{-1}$ in the equation above would
generate all entangled quantum effects from the product effect
$P_1 \otimes P_2$. This is exactly what we want.

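When would eq. (5) hold? Up to a factor 1/4, the map
$L^{\otimes 2}$ is an isometry: for all
$x, y \in \mathbb{R}^4 \otimes \mathbb{R}^4$, we have

$$\mathrm{Tr}(L^{\otimes 2}(x)\, L^{\otimes 2}(y)) = \tfrac{1}{4} \langle x, y \rangle.$$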
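The factor 1/4 can be confirmed numerically. The following is a minimal sketch, assuming that vectors in $\mathbb{R}^4 \otimes \mathbb{R}^4$ are stored as $4 \times 4$ coefficient tables with index 0 for the normalization component; all names are hypothetical.

```python
import random

I2 = [[1, 0], [0, 1]]
SX = [[0, 1], [1, 0]]
SY = [[0, -1j], [1j, 0]]
SZ = [[1, 0], [0, -1]]
BASIS = [I2, SX, SY, SZ]  # L maps the i-th unit vector of R^4 to BASIS[i] / 2


def kron(a, b):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[a[i][j] * b[k][l] for j in range(2) for l in range(2)]
            for i in range(2) for k in range(2)]


def l_tensor_l(x):
    """Apply L (x) L to x, given as a 4x4 table of real coefficients x[i][j]."""
    out = [[0j] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(4):
            term = kron(BASIS[i], BASIS[j])
            for r in range(4):
                for c in range(4):
                    out[r][c] += x[i][j] * term[r][c] / 4
    return out


def trace_of_product(a, b):
    """Tr(a b) for two 4x4 matrices."""
    return sum(a[r][c] * b[c][r] for r in range(4) for c in range(4))


def inner(x, y):
    """Euclidean inner product on R^4 (x) R^4 = R^16."""
    return sum(x[i][j] * y[i][j] for i in range(4) for j in range(4))


random.seed(0)
x = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
y = [[random.uniform(-1, 1) for _ in range(4)] for _ in range(4)]
print(trace_of_product(l_tensor_l(x), l_tensor_l(y)).real, inner(x, y) / 4)
```

The two printed numbers agree up to rounding, because the trace of a product of two tensor products of Pauli-basis elements is 4 when the elements coincide and 0 otherwise.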

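Thus, translating eq. (5) from $\mathcal{S}'_4$ back to
$\mathcal{S}_4$, the corresponding equation is

$$\langle E_1 \otimes E_2, T\omega \rangle = \langle T^{-1}(E_1 \otimes E_2), \omega \rangle.$$

Since $\langle E, T\omega \rangle = \langle T^T E, \omega \rangle$
for the Euclidean inner product, this is satisfied if
$T^T = T^{-1}$ for all $T \in \mathcal{G}_4$. In fact, we have

Lemma 9. All reversible transformations $T \in \mathcal{G}_4$
act as orthogonal matrices on $\mathbb{R}^{16}$.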

The proof of this lemma is non-trivial and somewhat 
surprising: it uses Schur's Lemma from group representa- 
tion theory, together with the fact that there exist certain 
kinds of SWAP and CNOT operations on two bits. These 
operations are constructed by using Postulate 2. 

Due to Lemma 9, all the above argumentation becomes solid:
eq. (5) is valid, and we get

Lemma 10. $\mathcal{S}'_4$ is the set of 2-qubit quantum states,
and the allowed effects are the quantum effects.

So what about the transformations? First of all, we know that
the transformation group of a single bit must be $SO(3)$ - it
cannot be $O(3)$, because local reflections would correspond to
partial transposition, which generates negative eigenvalues on
entangled states. Furthermore, every transformation
$T \in \mathcal{G}_4$ is a linear isometry on the set of
self-adjoint $4 \times 4$-matrices that maps the set of density
matrices into itself.
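The statement about partial transposition can be illustrated with a standard example. The sketch below uses the state $(|00\rangle + |11\rangle)/\sqrt{2}$ as an illustrative choice (it is not singled out in the text) and checks that its partial transpose has the eigenvalue $-1/2$; the function names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)


def bell_density():
    """Density matrix of (|00> + |11>)/sqrt(2) in the basis |00>, |01>, |10>, |11>."""
    psi = [S, 0.0, 0.0, S]
    return [[psi[r] * psi[c] for c in range(4)] for r in range(4)]


def partial_transpose_b(rho):
    """Transpose the second qubit: <i k| out |j l> = <i l| rho |j k>."""
    out = [[0.0] * 4 for _ in range(4)]
    for i in range(2):
        for k in range(2):
            for j in range(2):
                for l in range(2):
                    out[2 * i + k][2 * j + l] = rho[2 * i + l][2 * j + k]
    return out


def apply(matrix, vector):
    """Matrix-vector product for a 4x4 matrix."""
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]


pt = partial_transpose_b(bell_density())
v = [0.0, S, -S, 0.0]  # (|01> - |10>)/sqrt(2)
w = apply(pt, v)
# w equals -0.5 * v, so the partially transposed matrix is not a density matrix.
print(all(abs(wi + 0.5 * vi) < 1e-12 for wi, vi in zip(w, v)))
```

Under the map $L$, transposing a qubit density matrix flips the sign of the $\sigma_y$ component, which is a reflection of the Bloch ball; this is the sense in which a local reflection acts as a partial transposition.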

According to Wigner's Theorem [36, 37], only uni- 
tary and anti-unitary maps satisfy this. However, due 
to Wigner's normal form, anti-unitary maps generate re- 
flections in some Bloch ball faces of the state space, which 
is impossible due to Postulate 2. 

So $\mathcal{G}_4$ is a subgroup of the unitary group. Due to
Postulate 3, it maps some pure product state to an entangled
state. In other words, $\mathcal{G}_4$ contains an entangling
unitary, and also all local unitaries. It is a well-known fact
from quantum computation [38] that these transformations
generate the full unitary group.
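As an illustration of what "entangling unitary" means, the sketch below uses the CNOT gate, which is only an assumed example and not necessarily the transformation that arises in the proof. It maps a product state to an entangled one; the helper names are hypothetical.

```python
import math

S = 1 / math.sqrt(2)

# CNOT with the first qubit as control, in the basis |00>, |01>, |10>, |11>.
CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

PLUS_ZERO = [S, 0.0, S, 0.0]  # (|0> + |1>)/sqrt(2) tensor |0>


def apply(matrix, vector):
    """Matrix-vector product for a 4x4 matrix."""
    return [sum(matrix[r][c] * vector[c] for c in range(4)) for r in range(4)]


def is_product(amplitudes, tol=1e-12):
    """A pure 2-qubit state is a product state iff a00*a11 - a01*a10 = 0."""
    a00, a01, a10, a11 = amplitudes
    return abs(a00 * a11 - a01 * a10) < tol


print(is_product(PLUS_ZERO))               # input is a product state
print(is_product(apply(CNOT, PLUS_ZERO)))  # output is entangled
```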

We have thus shown

Lemma 11. The group of reversible transformations
$\mathcal{G}'_4$ on two bits corresponds to the unitary
conjugations, i.e. the maps $\rho \mapsto U \rho U^\dagger$ with
$U \in SU(4)$.

It is now clear that what we did for two bits can also be done
for $n$ bits. Since every $\mathcal{S}_N$ is contained in some
$\mathcal{S}_{2^n}$ for $n$ large enough, we can use the
subspace postulate to conclude that every state space
$\mathcal{S}_N$ is equivalent to the quantum $N$-level state
space.



VI. CONCLUSIONS AND OUTLOOK 

We have shown that the Hilbert space formalism of quantum
theory can be reconstructed from four natural,
information-theoretic postulates. We hope that this
reconstruction - together with other recent axiomatizations
[4-8] - contributes to a better understanding of quantum
theory, and sheds some light on some of the mysterious aspects
of its formalism, such as the appearance of complex numbers or
unitaries.

One of the main motivations for this work, as mentioned in the
introduction, was to search for "quantum theory's closest
cousins": dropping one or two of the axioms, and working out
the remaining set of theories, should yield interesting
alternative probabilistic theories that are conceptually close
to quantum theory, but not described by the Hilbert space
formalism. These theories make different physical predictions
[39] that can be tested experimentally [40].

What is the status of the search for those theories?
Currently, it seems that there are two natural ways to proceed.
The first possibility is to drop the subspace postulate
(Postulate 2), because it is in a way the strongest and most
complicated postulate. This raises the question of which other
theories (in addition to quantum theory and classical
probability theory) have the properties of local tomography and
pure-state transitivity, given that all effects are outcomes of
allowed measurements.

As shown in [41], in the case where the local systems 
are balls and the transformation groups are assumed to 
be continuous, quantum theory is still the only solution 
for two binary systems. In fact, there is currently no 
known example of a theory which satisfies the remaining 
three postulates and is not a part of quantum theory. 
This suggests the conjecture that the results of this paper 
remain basically valid if the subspace axiom is dropped. 

A second possibility is to drop local tomography, i.e. 
Postulate 1. Then it seems that indeed further theories 
appear as solutions, in particular state spaces of Jordan 
algebras [42]. It is an interesting open problem to work 
out this idea rigorously, and to classify all state spaces 
that appear in addition to quantum theory.